Retroviral Nucleic Material and Nucleotide Fragments Especially
Associated with Multiple Sclerosis and/or Rheumatoid Arthritis, for
Diagnostic, Preventive and Therapeutic Purposes

Abstract 

Nucleic material, in the isolated or purified state,
and nucleotide fragment, which includes a nucleotide
sequence chosen from the group that consists of (i) the
sequences SEQ ID NO: 112, SEQ ID NO: 114, SEQ ID NO: 117,
SEQ ID NO: 120, SEQ ID NO: 124, SEQ ID NO: 130, SEQ ID NO:
141 and SEQ ID NO: 142; (ii) the complementary sequences of
the sequences (i); and (iii) the sequences equivalent to the
sequences (i) or (ii), in particular the sequences that have,
over any series of 100 contiguous monomers, at least 50%,
and preferably at least 70% homology with the sequences (i)
or (ii) respectively, and their uses for detecting a
retrovirus associated with multiple sclerosis and/or
rheumatoid arthritis.



Multiple sclerosis (MS) is a demyelinating disease of the
central nervous system (CNS) whose cause is still not fully known.

Many studies have supported the hypothesis of a viral etiology 
of the disease, but none of the tested known viruses has turned out 
to be the sought causal agent: a review of the viruses studied 
over many years in MS was done by E. Norrby and R.T. Johnson. 

Recently, a retrovirus different from the known human
retroviruses was isolated from patients afflicted with MS. The
authors were also able to show that this retrovirus could be
transmitted in vitro, that patients afflicted with MS produced
antibodies capable of recognizing proteins associated with the
infection of leptomeningeal cells by this retrovirus, and that
the expression of the latter could be strongly stimulated by the
immediate-early genes of certain herpes viruses.

All these results argue in favor of the role of at least one 
unknown retrovirus or of a virus that has reverse transcriptase 
(RT) activity detectable by the method published by H. Perron and 
classified as "LM7 type RT" activity in MS. 







The studies by the applicant made it possible to obtain two
continuous lines of cells infected by natural isolated cultures
coming from two different patients afflicted with MS, by a culture
procedure such as described in the document WO-A-93 20188, the
content of which is incorporated by reference into the present
description. These two lines, derived from cells of human
choroid plexus, named LM7PC and PLI-2, were deposited at the
ECACC on July 22, 1992 and January 8, 1993 respectively, under the
numbers 92 072201 and 93 010817, in conformity with the
stipulations of the Budapest Treaty. In addition, the viral
isolated cultures that have RT activity of the LM7 type were also
deposited at the ECACC under the general designation of "strains."
The "strain" or isolated culture harbored by the PLI-2 line,
named POL-2, was deposited at the ECACC on July 22, 1992 under the
number V92072202. The "strain" or isolated culture harbored by the
line LM7PC, named MS7PG, was deposited at the ECACC on January 8, 1993
under number V93010816.

From the aforementioned cultures and isolated material,
characterized by biological and morphological criteria, we set
out to characterize the nucleic material associated with the viral
particles produced in these cultures.

The portions of the genome already described were used to 
perfect molecular detection tests of the viral genome and some 
immuno-serological tests, using the amino acid sequences coded by 
the nucleotide sequences of the viral genome, in order to detect 
the immune response directed against epitopes associated with the 
infection and/or the viral expression. 

These tools already let us confirm an association between MS 
and the expression of the sequences identified in the previously 
cited patents. However, the viral system discovered by the 
applicant is related to a complex retroviral system. Indeed, the 
sequences found encapsulated in the extra-cellular particles 
produced by the different cell cultures of patients afflicted with 
MS show clearly that there is co-encapsulation of retroviral 
genomes that are related but different from the "wild" retroviral 
genome that produces the infecting viral particles. This 
phenomenon has been observed among the replicative retroviruses and 
endogenous retroviruses belonging to the same family, or even 
heterologous. The notion of endogenous retrovirus is very
important in the context of our discovery because, in the case of
MSRV-1, we have observed that some endogenous retroviral sequences
that include sequences homologous with the MSRV-1 genome exist in
normal human DNA. The existence of endogenous retroviral elements
(ERVs), related to MSRV-1 by all or part of their genome, explains
the fact that the expression of the retrovirus MSRV-1 in human
cells can interact with similar endogenous sequences. Such
interactions are found in the case of pathogenic and/or infectious
endogenous retroviruses (for example, some ecotropic
strains of murine leukemia virus), and in the case of exogenous
retroviruses whose nucleotide sequence can be found partially
or completely, in the form of ERVs, in the genome of the host
animal (e.g. the exogenous mouse mammary tumor virus transmitted
by milk). These interactions consist mainly of (i) a
transactivation or co-activation of ERVs by the replicative
retrovirus, (ii) an "illegitimate" encapsulation of RNA related to
ERVs, or of ERVs, or even of cellular RNA, that simply have compatible
encapsulation sequences, in the retroviral particles produced by
the expression of the replicative strain, sometimes transmissible
and sometimes with a characteristic pathogenicity, and (iii)
more or less extensive recombinations among the co-encapsulated
genomes, especially in the reverse transcription phases, which lead
to the formation of hybrid genomes, sometimes transmissible and
sometimes with a characteristic pathogenicity.

Thus, (i) different sequences related to MSRV-1 have been 
found in the purified viral particles; (ii) a molecular analysis of 
the different regions of the MSRV-1 retroviral genome must be done 
by analyzing systematically the co-encapsulated, interfering and/or 
recombined sequences that are generated by infection and/or
expression of MSRV-1, and in addition, some clones can have parts
of defective sequences produced by retroviral replication and
template and/or transcription errors of reverse transcriptase; (iii)
the families of sequences related to a single retroviral genomic
region are the supports of global diagnostic detection that can be
optimized by the identification of unvarying regions among the 
clones expressed and by the identification of reading frames 
responsible for the production of antigenic and/or pathogenic 
polypeptides that can be produced only by a part, or even only one, 
of the clones expressed and under these conditions, the systematic 
analysis of the clones expressed in a region of given gene lets one 
evaluate the frequency of variation and/or recombination of the 
genome MSRV-1 in this region and to define the optimal sequences 
for applications, especially diagnostic; (iv) the pathology caused 
by a retrovirus such as MSRV-1 can be a direct effect of its 
expression and of the proteins or peptides produced due to this 
fact, but also an effect of the activation, the encapsulation, and 
the recombination of related or heterologous genomes and of 
proteins or peptides produced by these events; thus, these genomes 
associated with the expression of and/or the infection by MSRV-1 
are an integral part of the potential pathogenicity of this virus 
and therefore, comprise supports for diagnostic detection and 
particular therapeutic targets. Also, any agent associated with,
or a co-factor of, these interactions responsible for the
pathogenicity in question, such as MSRV-2 or the gliotoxin factor
described in the patent application published under number
FR-2,716,198, can participate in the development of an overall and
very effective strategy of diagnosis, prognosis, therapeutic
monitoring and/or integrated therapy for MS especially, but also
for any other disease associated with the same agents.

In this context, a parallel discovery has been made in another 
autoimmune disease, rheumatoid arthritis (RA) , which was described 
in the French patent application filed under number 95 02960. This 
discovery shows that, by applying methodological approaches similar 
to those that were used in the studies of the applicant for MS, it
was possible to identify a retrovirus expressed in RA that shares 
the sequences described for MSRV-1 in MS and also the co-existence 
of an MSRV-2 associated sequence also described in MS. With 
respect to MSRV-1, the sequences jointly detected in MS and RA 
pertain to the genes pol and gag. Given the present state of 
knowledge, one can associate the described sequences gag and pol 
with the MSRV-1 strains expressed in these two diseases. 

The subject of the present patent application is various
results that are supplementary to those already
protected by the French patent applications:

■ No. 92 04322 of April 3, 1992, published under No. 2,689,519; 

■ No. 92 13447 of November 3, 1992, published under No. 2,689,521; 

■ No. 92 13443 of November 3, 1992, published under No. 2,689,520; 

■ No. 94 01529 of February 4, 1994, published under No. 2,715,936; 

■ No. 94 01531 of February 4, 1994, published under No. 2,715,939; 

■ No. 94 01530 of February 4, 1994, published under No. 2,715,936; 

■ No. 94 01532 of February 4, 1994, published under No. 2,715,937; 

■ No. 94 14322 of November 24, 1994, published under No. 2,727,428; 

■ No. 94 15810 of December 23, 1994, published under No. 2,728,585; 

And 

■ The patent application WO-97/06260. 

The present invention pertains first to a nucleic material,
which can consist of a retroviral material, in the isolated or
purified state, and which can be defined or characterized in
different ways:

■ It includes a nucleotide sequence chosen from the group
that consists of (i) the sequences SEQ ID NO: 112, SEQ ID NO: 114,
SEQ ID NO: 117, SEQ ID NO: 120, SEQ ID NO: 124, SEQ ID NO: 130, SEQ
ID NO: 141 and SEQ ID NO: 142; (ii) the sequences that are
complementary to the sequences of (i); and (iii) the sequences
equivalent to the sequences of (i) or (ii), in particular the
sequences that have, over any series of 100 contiguous
monomers, at least 50%, and preferably at least 70% homology with
the sequences (i) or (ii) respectively;



■ It codes for a polypeptide that has, over any
contiguous series of at least 30 amino acids, at least 50%, and
preferably at least 70% homology, with a peptide sequence chosen from
the group that consists of SEQ ID NO: 113, SEQ ID NO: 115, SEQ ID
NO: 118, SEQ ID NO: 121, SEQ ID NO: 135 and SEQ ID NO: 137;

■ Its gene pol includes a nucleotide sequence that is 
identical or equivalent to a sequence chosen in the group that 
consists of SEQ ID NO: 112, SEQ ID NO: 124 and their complementary 
sequences ; 

■ The 5' end of its gene pol begins at the nucleotide 1419
of SEQ ID NO: 130;

■ Its gene pol codes for a polypeptide that has, over any
contiguous series of at least 30 amino acids, at least 50%,
and preferably at least 70% homology, with the peptide sequence SEQ
ID NO: 113;

■ The 3' end of its gene gag ends at the nucleotide 1418 of
SEQ ID NO: 130;

■ Its gene env includes a nucleotide sequence identical to 
or equivalent to a sequence chosen in the group that consists of 
SEQ ID NO: 117, and its complementary sequences; 

■ Its gene env includes a nucleotide sequence that begins 
at the nucleotide 1 of SEQ ID NO: 117 and ends at the nucleotide 
233 of SEQ ID NO: 114; 

■ Its gene env codes for a polypeptide that has, over any
contiguous series of at least 30 amino acids, at least 50%, and
preferably at least 70% homology, with the sequence SEQ ID NO: 118;

■ The region U3R of its 3' LTR includes a nucleotide
sequence that terminates at the nucleotide 617 of SEQ ID NO: 114;

■ The region RU5 of its 5' LTR includes a nucleotide
sequence that begins at the nucleotide 755 of SEQ ID NO: 120 and
ends at nucleotide 337 of SEQ ID NO: 141 or SEQ ID NO: 142;

■ A retroviral nucleic material that includes a sequence
that begins at nucleotide 755 of SEQ ID NO: 120 and that terminates
at nucleotide 617 of SEQ ID NO: 114;

■ The retroviral nucleic material as defined previously is
in particular associated with at least one autoimmune disease such
as multiple sclerosis or rheumatoid arthritis.

The invention pertains also to a nucleotide fragment that 
meets at least one of the following definitions: 

■ It includes or consists of a nucleotide sequence chosen
from the group that consists of (i) the sequences SEQ ID NO: 112, SEQ
ID NO: 114, SEQ ID NO: 117, SEQ ID NO: 120, SEQ ID NO: 124, SEQ ID
NO: 130, SEQ ID NO: 141 and SEQ ID NO: 142; (ii) the complementary
sequences of the sequences (i); and (iii) the sequences equivalent
to the sequences of (i) or (ii), in particular the sequences that
have, over any series of 100 contiguous monomers, at least
50%, and preferably at least 70% homology with the sequences (i) or
(ii) respectively;

■ It includes or consists of a nucleotide sequence that
codes for a polypeptide that has, over any contiguous series
of at least 30 amino acids, at least 50%, and preferably at least
70% homology, with a peptide sequence chosen from the group that
consists of SEQ ID NO: 113, SEQ ID NO: 115, SEQ ID NO: 118, SEQ ID
NO: 121, SEQ ID NO: 135 and SEQ ID NO: 137.
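
The homology criterion stated for equivalent sequences can be illustrated with a short sketch. It assumes that the two sequences have already been aligned position by position without gaps, and it reads the criterion as requiring the threshold in every window of contiguous monomers; both points are assumptions made for illustration, since the description does not prescribe an alignment method. The function names and the toy sequences are hypothetical.

```python
def window_identities(candidate: str, reference: str, window: int = 100) -> list[float]:
    """Identity fraction for each window of `window` contiguous monomers.

    Assumes the two sequences are already aligned position by position
    (same length, no gaps); this is an illustrative simplification.
    """
    if len(candidate) != len(reference):
        raise ValueError("sequences must be pre-aligned to the same length")
    if window <= 0 or len(reference) < window:
        raise ValueError("sequences must contain at least one full window")
    matches = [int(a == b) for a, b in zip(candidate.upper(), reference.upper())]
    # Sliding sum of matches: add the entering position, drop the leaving one.
    current = sum(matches[:window])
    counts = [current]
    for i in range(window, len(matches)):
        current += matches[i] - matches[i - window]
        counts.append(current)
    return [c / window for c in counts]


def meets_homology(identities: list[float], threshold: float = 0.50) -> bool:
    """True if every window reaches the threshold (0.50, preferably 0.70)."""
    return min(identities) >= threshold


# Toy 20-mers with one substitution; a 10-monomer window is used for brevity.
reference = "ACGTACGTACGTACGTACGT"
candidate = "ACGTACGTACTTACGTACGT"
identities = window_identities(candidate, reference, window=10)
print(min(identities), meets_homology(identities, 0.70))
```

The same functions apply to peptide sequences by passing amino acid strings and a window of 30.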

Other aims of the present invention are the following: 

■ A nucleic probe for the detection of a retrovirus
associated with multiple sclerosis and/or rheumatoid arthritis,
capable of hybridizing specifically with any fragment previously
defined and belonging to the genome of the said retrovirus; it has
advantageously from 10 to 100 nucleotides, preferably from 10 to 30
nucleotides;

■ A primer for amplification by polymerization of RNA or
DNA of a retrovirus associated with multiple sclerosis and/or
rheumatoid arthritis, which includes a nucleotide sequence
identical or equivalent to at least one part of the nucleotide
sequence of a fragment defined previously, especially a nucleotide
sequence that has, over any series of 10 contiguous monomers,
at least 50%, and preferably at least 70% homology with at least
the said part of the said fragment; preferably the nucleotide
sequence of a primer of the invention is chosen among SEQ ID NO:
116, SEQ ID NO: 119, SEQ ID NO: 122, SEQ ID NO: 123, SEQ ID NO:
125, SEQ ID NO: 127, SEQ ID NO: 128, SEQ ID NO: 129, SEQ ID NO:
132, and SEQ ID NO: 133;

■ An RNA or a DNA, and especially a replication and/or
expression vector, which includes a genomic fragment of the nucleic
material or a fragment defined previously;

■ A peptide coded by any open reading frame belonging to a
previously defined nucleotide fragment, especially a polypeptide,
for example an oligopeptide that forms or includes an antigenic
determinant recognized by the sera of patients infected by the
MSRV-1 virus, and/or in whom the MSRV-1 virus has been
reactivated; a preferred peptide includes a sequence that is
identical, partially or fully, or equivalent to a sequence chosen
among SEQ ID NO: 113, SEQ ID NO: 115, SEQ ID NO: 118, SEQ ID NO:
121, SEQ ID NO: 135 and SEQ ID NO: 137;

■ A diagnostic, preventive, or therapeutic compound,
especially for inhibiting the expression of at least one retrovirus
associated with multiple sclerosis and/or rheumatoid arthritis,
which includes a previously defined nucleotide fragment;

■ A procedure for detecting a retrovirus associated with
multiple sclerosis and/or rheumatoid arthritis, in a biological
sample, which includes the stages that consist of putting RNA
and/or DNA presumed to belong to or to come from the said retrovirus,
or their complementary RNA and/or DNA, in contact with a compound
that includes a nucleotide fragment as defined earlier.

Before detailing the invention different terms used in the 
description and the claims will now be defined: 

■ By strain or isolated culture or isolated material we 
mean any infecting and/or pathogenic biological fraction, which 
contains viruses and/or bacteria and/or parasites for example, 
which generate a pathogenic and/or antigenic power, harbored by a 
culture or a living host; as an example, a viral strain according 
to the preceding definition can contain a co-infecting agent, a 
pathogenic unicellular organism; 

■ The term "MSRV" used in the present description 
designates any pathogenic agent and/or infecting agent, associated 
with MS, especially a viral species, the attenuated strains of the 
said viral species, or the interfering defective particles that 
contain encapsulated genomes or even some genomes recombined with 
one part of the MSRV-1 genome, derived from this species. It is
known that viruses, and particularly viruses that contain
RNA, have a variability, resulting especially from relatively
high rates of spontaneous mutation, which will be taken into account
subsequently for defining the concept of equivalence,

■ By human virus we mean a virus capable of infecting or of
being harbored by human beings,

■ Considering all the natural or induced variations and/or
recombinations that could be met in the practice of the present
invention, the aims of the latter, defined previously and in the
claims, have been expressed by including the equivalents defined
subsequently, especially homologous nucleotide or peptide
sequences,

■ The variant of a virus or a pathogenic and/or infecting
agent according to the invention includes at least one antigen
recognized by at least one antibody directed against at least one
corresponding antigen of the said virus and/or the said pathogenic
and/or infecting agent, and/or a genome of which any part is
detected by at least one hybridization probe, and/or at least one
nucleotide amplification primer specific for the said virus
and/or pathogenic and/or infecting agent, under specific
hybridization conditions well known to a person skilled in the art,

■ According to the invention, a nucleotide fragment or an
oligonucleotide or a polynucleotide is a series of monomers, or
a biopolymer, characterized by the informational sequence of
natural nucleic acids, capable of hybridizing with any other
nucleotide fragment under predetermined conditions, the series
being capable of containing monomers with different chemical
structures and of being obtained from a molecule of natural nucleic
acid and/or by genetic recombination and/or by chemical synthesis; a
nucleotide fragment can be identical to a genome fragment of the
MSRV-1 virus considered by the present invention, especially a gene
of the latter, pol or env for example in the case of the said
virus;

■ Thus, a monomer can be a natural nucleotide of nucleic
acid, in which the constituent elements are a sugar, a phosphate
group and a nitrogenous base; in RNA the sugar is ribose, in DNA
the sugar is 2-deoxyribose; whether it is a question of DNA or
RNA, the nitrogenous base is chosen among adenine, guanine, uracil,
cytosine, thymine; or the nucleotide can be modified in at least
one of the three constituent elements; as an example, the
modification can occur at the level of the bases, generating
modified bases such as inosine, 5-methyldeoxycytidine,
deoxyuridine, 5-dimethylaminodeoxyuridine, 2,6-diaminopurine,
5-bromodeoxyuridine and any other modified base that promotes
hybridization; at the sugar level, the modification can consist of
the replacement of at least one deoxyribose by a polyamide, and at
the level of the phosphate group, the modification can consist of
its replacement by esters, especially chosen among diphosphate,
alkyl- and arylphosphonate and phosphorothioate esters,

■ By "informational sequence" we mean any ordered series of
monomers whose chemical nature and order in a reference direction
constitute, or not, functional information of the same quality as
that of natural nucleic acids,

■ By hybridization we mean the process during which, under 
appropriate operating conditions, two nucleotide fragments that 
have sufficiently complementary sequences pair to form a complex 
structure, especially a double or triple one, preferably in the 
form of a helix, 

■ A probe includes a nucleotide fragment synthesized by 
chemical means or obtained by digestion or enzymatic cutting of a 
longer nucleotide fragment, which includes at least six monomers, 
advantageously from 10 to 100 monomers, preferably from 10 to 30 
monomers, and having hybridization specificity under specific 
conditions; preferably, a probe that has less than 10 monomers is 
not used alone, but is used in the presence of other probes with 
size just as short or not; under certain particular conditions it 
could be useful to use probes of size greater than 100 monomers; a 
probe can in particular be used for diagnostic purposes and in this 
case one will use capture and/or detection probes, for example, 

■ The capture probe can be immobilized on a solid support 
by any suitable means, that is directly or indirectly, by covalence 
or passive adsorption for example, 

■ The detection probe can be labeled by means of a marker
chosen especially among radioactive isotopes, enzymes especially
chosen among peroxidase and alkaline phosphatase and those capable
of hydrolyzing a chromogenic, fluorigenic or luminescent substrate,
chromophoric chemical compounds, chromogenic, fluorigenic or
luminescent compounds, analogues of nucleotide bases, and biotin,

■ The probes used for diagnostic purposes of the invention
can be employed in all the known hybridization techniques, and
especially the techniques called "DOT-BLOT," "SOUTHERN BLOT,"
"NORTHERN BLOT," which is a technique identical to the "SOUTHERN
BLOT" technique but which uses RNA as the target, and the SANDWICH
technique; advantageously one uses the SANDWICH technique in the
present invention, which includes a specific capture probe and/or a
specific detection probe, it being understood that the capture
probe and the detection probe must have nucleotide sequences that
are at least partially different,

■ Any probe according to the present invention can be
hybridized in vivo or in vitro with RNA and/or with DNA, to block the
replication phenomena, especially translation and/or transcription,
and/or to degrade the said DNA and/or RNA,

■ A primer is a probe that includes at least six
monomers, and advantageously from 10 to 30 monomers, which has a
hybridization specificity under specific conditions, for the
initiation of an enzymatic polymerization, for example in an
amplification technique such as PCR (polymerase chain reaction), in
an elongation process, such as sequencing, in a method of reverse
transcription or similar method,

■ Two nucleotide or peptide sequences are called equivalent 
or derived with respect to one another, or with respect to a 
reference sequence, if functionally the corresponding biopolymers 
can play approximately the same role, without being identical, with
respect to the application or use in question, or in the technique in
which they occur; two sequences obtained due to the natural 
variability, especially spontaneous mutation of the species from 
which they have been identified, or induced, as well as two 
homologous sequences, the homology being defined subsequently, are 
equivalent in particular, 

■ By "variability" we mean any modification, spontaneous or
induced, of a sequence, especially by substitution, and/or
insertion, and/or deletion of nucleotides and/or nucleotide
fragments, and/or extension and/or shortening of the sequence at
least at one of the ends; a non-natural variability can result from
the genetic engineering techniques used, for example from the choice
of synthesis primers, degenerate or not, retained to amplify a
nucleic acid; this variability can be conveyed by modifications of
any initial sequence, considered as the reference, and capable of
being expressed by a degree of homology with respect to the said
reference sequence,

■ The homology characterizes the degree of identity of two
nucleotide or peptide fragments that are compared; it is measured
by the identity percentage that is especially determined by direct
comparison of nucleotide or peptide sequences, with respect to
reference nucleotide or peptide sequences,

■ Any nucleotide fragment is called equivalent or derived 
from a reference fragment if it has a nucleotide sequence 
equivalent to the sequence of the reference fragment; based on the 
preceding definition the following in particular are equivalent to 
a reference nucleotide fragment: 

(a) Any fragment capable of hybridizing at least partially
with the complement of the reference fragment,

(b) Any fragment whose alignment with the reference
fragment reveals a number of identical contiguous bases
greater than with any other fragment coming from another
taxonomic group,

(c) Any fragment that results from or could result from the 
natural variability of the species, from which it is obtained, 

(d) Any fragment that could result from genetic engineering
techniques applied to the reference fragment,

(e) Any fragment that includes at least eight continuous 
nucleotides that code for a peptide that is homologous or identical 
to the peptide coded by the reference fragment, 

(f) Any fragment different from the reference fragment, by 
insertion, deletion, substitution of at least one monomer, 
extension, or shortening at least at one of its ends; for example, 
any fragment corresponding to the reference fragment, flanked at 
least at one of its ends by a nucleotide sequence that does not 
code for a polypeptide, 

■ By polypeptide we mean especially any peptide with at
least two amino acids, especially an oligopeptide or protein,
extracted, separated, or substantially isolated or synthesized by
human intervention, especially those obtained by chemical
synthesis, or by expression in a recombinant organism,

■ By polypeptide coded in a partial manner by a nucleotide 
fragment we mean a polypeptide that has at least three amino acids 
coded by at least nine contiguous monomers included in the said 
nucleotide fragment, 

■ An amino acid is said to be an analogue of another amino 
acid when their respective physico-chemical characteristics such as 
polarity, hydrophobicity, and/or basicity, and/or acidity, and/or
neutrality, are approximately the same; thus, a leucine is 
analogous to an isoleucine, 

■ Any polypeptide is said to be equivalent to or derived 
from a reference polypeptide if the compared polypeptides have 
approximately the same properties, and especially the same 
antigenic, immunological, enzymological properties, and the 
property of molecular recognition; it is especially equivalent to a 
reference polypeptide: 

(a) Any polypeptide that has a sequence in which at least one
amino acid has been substituted by an analogous amino acid,

(b) Any polypeptide that has an equivalent peptide sequence,
obtained by natural or induced variation of the said reference
polypeptide, and/or the nucleotide fragment that codes for the said
polypeptide,

(c) A mimotope of the said reference polypeptide, 

(d) Any polypeptide in the sequence of which one or several 
amino acids of the series L are replaced by an amino acid of the 
series D, and vice versa, 

(e) Any polypeptide in the sequence of which a modification
of the side chains of the amino acids has been introduced, such
as for example an acetylation of the amine functions, a
carboxylation of the thiol functions, an esterification of the
carboxylic functions,

(f) Any polypeptide in the sequence of which one of the 
peptide bonds has been modified, as for example the carba, retro, 
inverse, retro- inverse, reduced, and methylene-oxy bonds, 

(g) Any polypeptide in which at least one antigen is 
recognized by antibodies directed against a reference polypeptide, 

■ The percentage of identity that characterizes the
homology of two compared peptide fragments is at least 50% and
preferably at least 70% according to the present invention.

Since a virus that has reverse transcriptase enzymatic
activity can be genetically characterized just as easily in the RNA
form as in the DNA form, mention will be made of viral DNA as well as
viral RNA to characterize the sequences relating to a virus that
has such reverse transcriptase activity, called MSRV-1 according to
the present description.

The expressions of order used in the present description and
the claims, such as "first nucleotide sequence," are not intended to
express a particular order, but to define the invention more
clearly.

By detection of a substance or agent we also mean hereafter an
identification, a quantification, or a separation or isolation of
the said substance or the said agent.

The invention will be better understood from reading the
detailed description that follows in reference to the attached
figures in which:

Figure 1 shows the general structure of proviral DNA and
genomic RNA of MSRV-1.

Figure 2 shows the nucleotide sequence of the clone named
CL6-5' (SEQ ID No: 112) and three potential amino acid reading frames
that are included under the nucleotide sequence.

Figure 3 shows the nucleotide sequence of the clone named
CL6-3' (SEQ ID No: 114) and three potential amino acid reading frames
that are included under the nucleotide sequence.

Figure 4 shows the nucleotide sequence of the clone called C15 
(SEQ ID No: 117) and three potential amino acid reading frames that 
are included under the nucleotide sequence. 

Figure 5 shows the nucleotide sequence of the clone named 5M6 
(SEQ ID No: 120) and three potential amino acid reading frames that 
are included under the nucleotide sequence. 

Figure 6 shows the nucleotide sequence of the clone named CL2 
(SEQ ID No: 130) and three potential amino acid reading frames that 
are included under the nucleotide sequence. 

Figure 7 shows three potential amino acid reading frames 
expressed by pET28C-clone 2 and that are included under the 
nucleotide sequence.

Figure 8 shows three potential amino acid reading frames 
expressed by pER21C-clone 2 and that are included under the 
nucleotide sequence.

Figure 9 shows the nucleotide sequence of the clone named LB13 
(SEQ ID No: 141) and three potential amino acid reading frames that 
are included under the nucleotide sequence. 

Figure 10 shows the nucleotide sequence of the clone named
LA15 (SEQ ID No: 142) and three potential amino acid reading frames
that are included under the nucleotide sequence.

Figure 11 shows the nucleotide sequence of the clone named 
LB16 (SEQ ID No: 124) and three potential amino acid reading frames 
that are included under the nucleotide sequence. 
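
The three forward reading frames shown under each nucleotide sequence in the figures can be derived mechanically. The following is a minimal sketch, assuming the standard genetic code; the function names are hypothetical and the input is a toy sequence, not one of the SEQ IDs of the description.

```python
from itertools import product

# Standard genetic code, codons enumerated with bases in the order T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(seq: str, frame: int = 0) -> str:
    """Translate one forward reading frame (0, 1 or 2); '*' marks a stop codon."""
    if frame not in (0, 1, 2):
        raise ValueError("frame must be 0, 1 or 2")
    seq = seq.upper().replace("U", "T")  # accept RNA as well as DNA
    codons = (seq[i:i + 3] for i in range(frame, len(seq) - 2, 3))
    # Codons containing anything other than A, C, G, T are reported as 'X'.
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def three_frames(seq: str) -> list[str]:
    """Return the translations of the three forward reading frames."""
    return [translate(seq, frame) for frame in range(3)]


# Toy sequence, chosen only to show the output format.
for frame, peptide in enumerate(three_frames("ATGGCCTAA"), start=1):
    print(f"frame {frame}: {peptide}")
```

Only the forward strand is translated here, which matches the three frames per sequence mentioned for the figures; frames on the complementary strand would need the reverse complement first.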

EXAMPLE 1: PRODUCTION OF A CL6-5' REGION THAT CODES FOR THE
N-TERMINAL END OF THE INTEGRASE AND OF A CL6-3' REGION
THAT CONTAINS THE 3' TERMINAL SEQUENCE OF THE MSRV-1
GENOME

A 3' RACE was carried out on total RNA extracted from the plasma
of a patient afflicted with MS. Plasma from a healthy control,
treated under the same conditions, was used as a negative control.
The cDNA was synthesized with an oligo(dT) primer identified by
SEQ ID No: 68 (5' GAC TCG CTG CAG ATC GAT TTT TTT TTT TTT TTT T 3')
and the reverse transcriptase "Expand® RT" of Boehringer, according
to the conditions recommended by the company. A polymerase chain
reaction (PCR) was carried out with the enzyme Klentaq (Clontech)
under the following conditions: 94 °C for 5 minutes, then 40 cycles
of 93 °C for 1 minute, 58 °C for 1 minute and 68 °C for 3 minutes,
and finally 68 °C for 8 minutes, with a final reaction volume of
50 µl.
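
The cycling conditions of the first PCR of Example 1 can be written
down as a small data structure, which makes the programmed duration
easy to check. The following Python sketch is only an illustration:
the class and function names are hypothetical, and the ramp times
between temperatures, which depend on the thermocycler, are ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    temp_c: float   # hold temperature in degrees Celsius
    minutes: float  # hold time in minutes


def total_minutes(initial: Step, cycle: list[Step], n_cycles: int, final: Step) -> float:
    """Sum of the programmed hold times (ramp times are not included)."""
    per_cycle = sum(step.minutes for step in cycle)
    return initial.minutes + n_cycles * per_cycle + final.minutes


# Profile stated for the first PCR of Example 1.
INITIAL = Step(94, 5)
CYCLE = [Step(93, 1), Step(58, 1), Step(68, 3)]
FINAL = Step(68, 8)
N_CYCLES = 40

if __name__ == "__main__":
    minutes = total_minutes(INITIAL, CYCLE, N_CYCLES, FINAL)
    print(f"programmed hold time: {minutes} min")
```

With these values the hold times add up to 5 + 40 x 5 + 8 = 213
minutes.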

Primers used for the PCR:

■ Primer 5', identified by SEQ ID No: 69:
5' GCC ATC AAG CCA CCC AAG AAC TCT TAA CTT 3';

■ Primer 3', identified by SEQ ID No: 68.

A second, so-called "semi-nested" PCR was carried out with a
5' primer located inside the region already amplified. This second
PCR was carried out under the same experimental conditions as those
used during the first PCR, using 10 µl of the amplification product
derived from the first PCR.

Primers used for the semi-nested PCR:

■ Primer 5', identified by SEQ ID No: 70:
5' CCA ATA GCC AGA CCA TTA TAT ACA GTA ATT 3';

■ Primer 3', identified by SEQ ID No: 68.

The primers SEQ ID No: 69 and SEQ ID No: 70 are specific for 
the region pol of MSRV-1. 
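
Basic properties of the primers quoted above, such as length, GC
content and the reverse complement, can be computed directly from
the sequences. The Python sketch below is an illustration under
stated assumptions: the helper names are hypothetical, and the
melting temperature uses the simple Wallace rule (2 °C per A or T,
4 °C per G or C), which was devised for short oligonucleotides and
gives only a rough figure for 30-mers such as these.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Sequences as quoted in Example 1.
SEQ_ID_69 = "GCC ATC AAG CCA CCC AAG AAC TCT TAA CTT"
SEQ_ID_70 = "CCA ATA GCC AGA CCA TTA TAT ACA GTA ATT"


def clean(seq: str) -> str:
    """Remove triplet spacing and switch to upper case."""
    return seq.replace(" ", "").upper()


def gc_count(seq: str) -> int:
    s = clean(seq)
    return s.count("G") + s.count("C")


def gc_fraction(seq: str) -> float:
    return gc_count(seq) / len(clean(seq))


def wallace_tm(seq: str) -> int:
    """Rough melting temperature in degrees Celsius (Wallace rule)."""
    s = clean(seq)
    gc = gc_count(s)
    return 2 * (len(s) - gc) + 4 * gc


def reverse_complement(seq: str) -> str:
    """Sequence of the opposite strand, read 5' to 3'."""
    return clean(seq).translate(COMPLEMENT)[::-1]


if __name__ == "__main__":
    for name, seq in (("SEQ ID No: 69", SEQ_ID_69), ("SEQ ID No: 70", SEQ_ID_70)):
        print(name, len(clean(seq)), f"{gc_fraction(seq):.2f}", wallace_tm(seq))
```

The reverse complement is useful when a 3' primer, which is written
5' to 3' on the antisense strand, has to be located on the sense
strand of a clone.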

An amplification product of 1.9 kb was obtained for the plasma
of the MS patient. The corresponding fragment was not observed for
the healthy control plasma. This amplification product was cloned
in the following way:

The amplified DNA was inserted into a plasmid by means of the
TA Cloning® kit. Two µl of DNA solution were mixed with 5 µl of
sterile distilled water, 1 µl of a 10-fold concentrated ligation
buffer ("10X Ligation Buffer"), 2 µl of "pCR® VECTOR" (25 ng/ml)
and 1 µl of "T4 DNA LIGASE." This mixture was incubated overnight
at 12 °C. The following stages were carried out in conformity with
the instructions of the TA Cloning® kit (Invitrogen). At the end of
the procedure, the white colonies of recombinant bacteria were
subcultured in order to be grown and to allow the extraction of the
incorporated plasmids according to the so-called "miniprep"
procedure. The plasmid preparation of each recombinant colony was
digested with a suitable restriction enzyme and analyzed on agarose
gel. The plasmids that had an insert, detected under ultraviolet
light after staining of the gel with ethidium bromide, were
selected for the sequencing of the insert, following hybridization
with a primer complementary to the Sp6 promoter present on the
cloning plasmid of the TA Cloning® kit. The reaction prior to the
sequencing was then carried out according to the method recommended
for the use of the sequencing kit "PRISM® Ready Reaction AmpliTaq®
FS, DyeDeoxy® Terminator" (Applied Biosystems, ref. 402119), and
the automatic sequencing was performed on the Applied Biosystems
373 A and 377 devices, according to the instructions of the
manufacturer.

The resulting clone contains a region CL6-5' that codes for
the N-terminal end of the integrase and a region CL6-3', which
corresponds to the 3' terminal region of MSRV-1 and which allows
one to define the end of the envelope (234 bp) and the regions U3,
R (401 bp) of the retrovirus MSRV-1.

The region corresponding to the N-terminal end of the
integrase is represented by its nucleotide sequence (SEQ ID No:
112) in Fig. 2. The three potential reading frames are presented
by their amino acid sequence under the nucleotide sequence, and the
amino acid sequence of the N-terminal end of the integrase is
identified by SEQ ID No: 113.

The region CL6-3' is represented by its nucleotide sequence
(SEQ ID No: 114) in Fig. 3. The three potential reading frames are
presented by their amino acid sequence under the nucleotide
sequence. An amino acid sequence that corresponds to the C-terminal
end of the protein env of MSRV-1 is identified by SEQ ID No: 115.

EXAMPLE 2: PRODUCTION OF THE CLONE C15 THAT CONTAINS THE REGION
THAT CODES FOR ONE PART OF THE ENVELOPE OF THE
RETROVIRUS MSRV-1

An RT-PCR was carried out on the total RNA extracted from
virions concentrated by ultracentrifugation from the culture
supernatant of synoviocytes coming from an RA patient. The cDNA
was synthesized with an oligo(dT) primer and the reverse
transcriptase "Expand® RT" of Boehringer according to the
conditions recommended by the company. A PCR was carried out with
the Expand® Long Template PCR System (Boehringer) under the
following conditions: 94 °C for 5 minutes, then 40 cycles of 93 °C
for 1 minute, 60 °C for 1 minute and 68 °C for 3 minutes, and
finally 68 °C for 8 minutes, and with a final reaction volume [...].

Primers used for the PCR:

■ Primer 5', identified by SEQ ID No: 69:
5' GCC ATC AAG CCA CCC AAG AAC TCT TAA CTT 3';

■ Primer 3', identified by SEQ ID No: 116:
5' TGG GGT TGG ATT TGT AAG AGC ATG TGT AGG TT 3'

A second, so-called "semi-nested" PCR was carried out with a
5' primer located inside the region already amplified. This second
PCR was carried out under the same experimental conditions as those
used during the first PCR (except that 30 cycles were carried out
in place of 40), using 10 µl of the amplification product derived
from the first PCR.

Primers used for the semi-nested PCR:

■ Primer 5', identified by SEQ ID No: 70:
5' CCA ATA GCC AGA CCA TTA TAT ACA GTA ATT 3';

■ Primer 3', identified by SEQ ID No: 116

The primers SEQ ID No: 69 and SEQ ID No: 70 are specific for
the pol region of MSRV-1. The primer SEQ ID No: 116 is specific
for the sequence FBd13 (also named B13) and is located in the env
region conserved among the oncoretroviruses.

An amplification product of 1932 bp was obtained and cloned in
the following way: The amplified DNA was inserted into a plasmid by
means of the TA Cloning® kit. The different stages were carried
out in conformity with the instructions of the TA Cloning® kit
(Invitrogen). At the end of the procedure, the white colonies of
recombinant bacteria were subcultured in order to be grown and to
allow the extraction of the incorporated plasmids according to the
so-called "miniprep" procedure. The plasmid preparation of each
recombinant colony was digested with a suitable restriction enzyme
and analyzed on agarose gel. The plasmids that had an insert,
detected under ultraviolet light after staining of the gel with
ethidium bromide, were selected for the sequencing of the insert,
following hybridization with a primer complementary to the Sp6
promoter present on the cloning plasmid of the TA Cloning® kit. The
reaction prior to the sequencing was then carried out according to
the method recommended for the use of the sequencing kit "PRISM®
Ready Reaction AmpliTaq® FS, DyeDeoxy® Terminator" (Applied
Biosystems, ref. 402119), and the automatic sequencing was carried
out on the Applied Biosystems 373 A and 377 devices, according to
the instructions of the manufacturer.

The clone C15 obtained contains a region of 1481 bp that
corresponds to the region of the envelope of MSRV-1.

The env region of the clone C15 is represented by its
nucleotide sequence (SEQ ID No: 117) in Fig. 4. The three
potential reading frames of this clone are presented by their amino
acid sequence under the nucleotide sequence. The reading frame
corresponding to a structural env protein of MSRV-1 is identified
by SEQ ID No: 118.

EXAMPLE 3 : PRODUCTION OF A CLONE 5M6 THAT CONTAINS THE SEQUENCES 
OF THE TERMINAL 3' REGION OF THE ENVELOPE, FOLLOWED BY 
THE SEQUENCES U3, R, U5 OF THE MSRV-1 PROVIRAL TYPE. 

A single-direction PCR was carried out on DNA extracted from
immortalized B lymphocytes in culture from an RA patient. The PCR
was carried out with the Expand® Long Template PCR System
(Boehringer) under the following conditions: 94 °C for 3 minutes
then 93 °C for 1 minute for 10 cycles, then 93 °C for 1 minute,
60 °C for 1 minute with 15 seconds extension for each cycle, 68 °C
for 3 minutes for 35 cycles and 68 °C for 7 minutes and with a
final reaction volume of 50 µl.

The primer used for the PCR, identified by SEQ ID No: 119, is
5' TCA AAA TCG AAG AGC TTT AGA CTT GCT AAC CG 3'.

The primer SEQ ID No: 119 is specific for the env region of
the clone C15.

An amplification product of 1673 bp was obtained and cloned in
the following way: The amplified DNA was inserted into a plasmid by
means of the TA Cloning® kit. The different stages were carried
out in conformity with the instructions of the TA Cloning® kit
(Invitrogen). At the end of the procedure, the white colonies of
recombinant bacteria were subcultured in order to be grown and to
allow the extraction of the incorporated plasmids according to the
so-called "miniprep" procedure. The plasmid preparation of each
recombinant colony was digested with a suitable restriction enzyme
and analyzed on agarose gel. The plasmids that had an insert,
detected under ultraviolet light after staining of the gel with
ethidium bromide, were selected for the sequencing of the insert,
after hybridization with a primer complementary to the T7 promoter
present on the cloning plasmid of the TA Cloning® kit. The reaction
prior to the sequencing was then carried out according to the
method recommended for the use of the sequencing kit "PRISM® Ready
Reaction AmpliTaq® FS, DyeDeoxy® Terminator" (Applied Biosystems,
ref. 402119), and the automatic sequencing was carried out on the
Applied Biosystems 373 A and 377 devices, according to the
instructions of the manufacturer.

The resulting clone 5M6 contains a region of 492 bp that
corresponds to the 3' region of the envelope of MSRV-1, followed by
the regions U3, R and U5 (837 bp) of MSRV-1.

The clone 5M6 is represented by its nucleotide sequence (SEQ
ID No: 120) in Fig. 5. The three potential reading frames of this
clone are presented by their amino acid sequence under the
nucleotide sequence. The reading frame corresponding to the
C-terminal end of the protein env of MSRV-1 is identified by SEQ ID
No: 121.

EXAMPLE 4: PRODUCTION OF THE CLONE LB16 THAT CONTAINS THE REGION
THAT CODES FOR THE INTEGRASE OF THE MSRV-1 RETROVIRUS.

An RT-PCR was carried out on total RNA treated with DNase I
and extracted from a choroid plexus that came from an MS patient.
The cDNA was synthesized with an oligo(dT) primer and the
"Expand® RT" reverse transcriptase of Boehringer according to the
conditions recommended by the company. A "no RT" control was
carried out at the same time on the same material. A PCR was
carried out with the Taq polymerase (Perkin Elmer) under the
following conditions: 95 °C for 5 minutes, then 35 cycles of 95 °C
for 1 minute, 55 °C for 1 minute and 72 °C for 2 minutes, and
finally 72 °C for 8 minutes, with a final reaction volume of 50 µl.

Primers used for the PCR:

■ Primer 5', identified by SEQ ID No: 122:
5' GGC ATT GAT AGC ACC CAT CAG 3';

■ Primer 3', identified by SEQ ID No: 123:
5' CAT GTC ACC AGG GTG GAA TAG 3'

The primer SEQ ID No: 122 is specific for the pol region of
MSRV-1, more precisely for a sequence similar to the integrase
region described previously. The primer SEQ ID No: 123 was designed
from sequences of clones obtained in earlier experiments.

An amplification product of about 760 bp was obtained only in
the test with RT and was cloned in the following way:

The amplified DNA was inserted into a plasmid by means of the
TA Cloning® kit. The different stages were carried out in
conformity with the instructions of the TA Cloning® kit
(Invitrogen). At the end of the procedure, the white colonies of
recombinant bacteria were subcultured in order to be grown and to
allow the extraction of the incorporated plasmids according to the
so-called "miniprep" procedure. The plasmid preparation of each
recombinant colony was digested with a suitable restriction enzyme
and analyzed on agarose gel. The plasmids that had an insert,
detected under ultraviolet light after staining of the gel with
ethidium bromide, were selected for the sequencing of the insert,
following hybridization with a primer complementary to the T7
promoter present on the cloning plasmid of the TA Cloning® kit.
The reaction prior to the sequencing was then carried out according
to the method recommended for the use of the sequencing kit "PRISM®
Ready Reaction AmpliTaq® FS, DyeDeoxy® Terminator" (Applied
Biosystems, ref. 402119), and the automatic sequencing was carried
out on the Applied Biosystems 373 A and 377 devices, according to
the instructions of the manufacturer.

The clone LB16 produced contains the sequences corresponding
to the integrase. The nucleotide sequence of this clone is
identified by SEQ ID No: 124 and is shown in Fig. 11, in which
three reading frames were determined.

EXAMPLE 5: PRODUCTION OF A CLONE 2, CL2, WHICH CONTAINS AT 3' A
PART THAT IS HOMOLOGOUS TO THE GENE POL, WHICH
CORRESPONDS TO THE PROTEASE GENE, AND TO THE GENE GAG
(GM3) THAT CORRESPONDS TO THE NUCLEOCAPSID, AND A NEW
CODING REGION 5' THAT CORRESPONDS TO THE GENE GAG, MORE
SPECIFICALLY THE MATRIX AND THE CAPSID OF MSRV-1.

An amplification by PCR was carried out on total RNA extracted
from 100 µl of plasma of a patient afflicted with MS. A water
control, treated under the same conditions, was used as a negative
control. The cDNA was synthesized with 300 pmole of a random
primer (GIBCO-BRL, France) and the reverse transcriptase
"Expand RT" (Boehringer Mannheim, France) according to the
conditions recommended by the company. An amplification by PCR was
carried out with the enzyme Taq polymerase (Perkin Elmer, France)
using 10 µl of cDNA under the following conditions: 94 °C for 2
minutes, 55 °C for 1 minute, 72 °C for 2 minutes, then 94 °C
for 1 minute, 55 °C for 1 minute, 72 °C for 2 minutes during 30
cycles and 72 °C for 7 minutes, and with a final reaction volume
of 50 µl.

Primers used for the amplification by PCR:

■ Primer 5', identified by SEQ ID No: 126:
5' CGG ACA TCC AAA GTG ATG GGA AAC G 3';

■ Primer 3', identified by SEQ ID No: 127:
5' GGA CAG GAA AGT AAG ACT GAG AAG GC 3'

A second, so-called "semi-nested" amplification by PCR was
carried out with a 5' primer located inside the region already
amplified. This second PCR was carried out under the same
experimental conditions as those used during the first PCR, using
10 µl of the amplification product derived from the first PCR.

Primers used for the semi-nested amplification by PCR:

■ Primer 5', identified by SEQ ID No: 128:
5' GGT AGA AGG TAT TGT GGA GAA TTG GG 3';

■ Primer 3', identified by SEQ ID No: 129:
5' TGG GTG TGA ATG GTG AAA GAT ACG GG 3'

The primers SEQ ID No: and SEQ ID No: are specific for the
pol region, clone G+E+A, more specifically the region E: nucleotide
positions No. 423 to No. 448. The primers used in the 5' region
were designed from sequences of clones obtained in earlier
experiments.

An amplification product of 1511 bp was obtained from the RNA
extracted from the plasma of an MS patient. The corresponding
fragment was not observed for the water control. This
amplification product was cloned in the following way.

The amplified DNA was inserted into a plasmid by means of the
TA Cloning® kit. Two µl of the DNA solution were mixed with 5 µl of
sterile distilled water, 1 µl of a 10-fold concentrated ligation
buffer ("10X Ligation Buffer"), 2 µl of "pCR® VECTOR" (25 ng/ml)
and 1 µl of "T4 DNA LIGASE." This mixture was incubated overnight
at 14 °C. The following stages were carried out in conformity with
the instructions of the TA Cloning® kit (Invitrogen). After
transformation of E. coli INVαF' bacteria with the ligation
product, the mixture was plated. At the end of the procedure, the
white colonies of recombinant bacteria were subcultured in order to
be grown and to allow the extraction of the incorporated plasmids
according to the so-called "minipreparation of DNA" procedure (17).
The plasmid preparation of each recombinant colony was digested
with the restriction enzyme Eco RI and analyzed on agarose gel. The
plasmids that had an insert, detected under ultraviolet light after
staining of the gel with ethidium bromide, were selected for the
sequencing of the insert, after hybridization with a primer
complementary to the T7 promoter present on the cloning plasmid of
the TA Cloning® kit. The reaction prior to the sequencing was then
carried out according to the method recommended for the use of the
sequencing kit "PRISM® Ready Reaction AmpliTaq® FS, DyeDeoxy®
Terminator" (Applied Biosystems, ref. 402119), and the automatic
sequencing was carried out on the Applied Biosystems 373 A and 377
devices, according to the instructions of the manufacturer.

The resulting clone, named CL2, contains a C-terminal region
similar to the 5' terminal region of the clones G+E+A of MSRV-1,
which allows one to define the C-terminal region of the gene gag,
and a new region corresponding to the N-terminal region of the gene
gag of MSRV-1.

CL2 allows one to define a region of 1511 bp that has, in the
N-terminal region, an open reading frame of 1077 bp that codes for
359 amino acids and a non-open reading frame of 454 bp, which
corresponds to the C-terminal region of the gene gag of MSRV-1.

The nucleotide sequence of CL2 is identified by SEQ ID No:
130. It is represented in Figure 6, with the potential amino
acid reading frames.

The fragment of 1077 bp of CL2 that codes for 359 amino acids
was amplified by PCR with the enzyme Pwo (5 U/µl) (Boehringer
Mannheim, France) using 1 µl of the DNA minipreparation of the
clone 2 under the following conditions: 95 °C for 1 minute, 60 °C
for 1 minute, 72 °C for 2 minutes during 25 cycles and with a final
reaction volume of 50 µl, by means of the primers:

■ Primer 5' (Bam HI), identified by SEQ ID No: 132:
5' TGC TGG AAT TCG GGA TCC TAG AAC GTA TTC 3' (30 mer), and

■ Primer 3' (Hind III), identified by SEQ ID No: 133:
5' AGT TCT GCT CCG AAG CTT AGG CAG ACT TTT 3' (30 mer),

which correspond, respectively, to the nucleotide sequence of the
clone 2 in position -9 to 21 and 1066 to 1095.

The fragment obtained after PCR was digested with Bam HI and
Hind III and subcloned in the expression vectors pET28C and pET21C
(Novagen) digested with Bam HI and Hind III. The DNA sequencing of
the fragment of 1077 bp of the clone 2 in the two expression
vectors was carried out according to the method recommended for the
use of the sequencing kit "PRISM® Ready Reaction AmpliTaq® FS,
DyeDeoxy® Terminator" (Applied Biosystems, ref. 402119), and the
automatic sequencing was carried out on the Applied Biosystems
373 A and 377 devices, according to the instructions of the
manufacturer.

The expression products of the nucleotide sequence of the
1077 bp fragment of the clone 2 in the expression vectors pET28C
and pET21C are identified by SEQ ID No: 135 and SEQ ID No: 137
respectively.

EXAMPLE 6: EXPRESSION OF CLONE 2 IN ESCHERICHIA COLI 

The constructions pET28C-clone 2 (1077 bp) and pET21C-clone 2
(1077 bp) synthesize, in the bacterial strain BL21 (DE3), a protein
fused to 6 histidines, at the N- and C-terminal ends for the vector
pET28C and at the C-terminal end for the vector pET21C, with an
apparent molecular weight of about 45 kDa, demonstrated by
SDS-PAGE polyacrylamide gel electrophoresis (SDS = sodium dodecyl
sulfate) (Laemmli, 1970 (1)). The reactivity of the protein toward
an anti-histidine monoclonal antibody (Dianova) was demonstrated by
the Western blot technique (Towbin et al., 1979 (2)).
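
The figures quoted for clone 2 can be cross-checked with simple
arithmetic: an open reading frame of 1077 bp corresponds to
1077 / 3 = 359 codons, and a rough mass can be estimated from the
number of residues. The Python sketch below is an illustration
only. The average residue mass of 110 Da is a common rule of thumb
and an assumption here, the function names are hypothetical, and
the additional residues contributed by the vector fusions are not
counted.

```python
AVERAGE_RESIDUE_MASS_DA = 110.0  # rule-of-thumb value, an assumption


def codon_count(orf_length_bp: int) -> int:
    """Number of codons in a reading frame given its length in base pairs."""
    if orf_length_bp % 3 != 0:
        raise ValueError("reading frame length must be a multiple of 3 bp")
    return orf_length_bp // 3


def approx_mass_kda(n_residues: int) -> float:
    """Rough polypeptide mass in kDa from the number of amino acids."""
    return n_residues * AVERAGE_RESIDUE_MASS_DA / 1000.0


if __name__ == "__main__":
    residues = codon_count(1077)
    print(residues, "amino acids, about", approx_mass_kda(residues), "kDa")
```

Under this assumption the 359 amino acids alone account for roughly
39.5 kDa, which is of the same order as the apparent molecular
weight of about 45 kDa observed for the fusion proteins.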

The recombinant proteins pET28C-clone 2 (1077 bp) and
pET21C-clone 2 (1077 bp) were detected by SDS-PAGE in the insoluble
fraction after enzymatic digestion of the bacterial extracts with
50 µl of lysozyme (10 mg/ml) and ultrasound lysis.

The antigenic properties of the recombinant antigens
pET28C-clone 2 (1077 bp) and pET21C-clone 2 (1077 bp) were tested
by the Western blot technique after solubilization of the bacterial
pellet with 2% SDS and 50 mM of beta-mercaptoethanol. After
incubation with the sera of patients afflicted with multiple
sclerosis, the sera of the neurological controls and the reference
sera of the blood transfusion center (CTS), the immune complexes
were detected by means of goat anti-human IgG and anti-human IgM
sera coupled with alkaline phosphatase.

The results are presented in the following table.

TABLE

Reactivity of sera from patients afflicted with multiple sclerosis
and of reference sera with the recombinant protein MSRV-1 gag
clone 2 (1077 bp): pET21C-clone 2 (1077 bp) and pET28C-clone 2
(1077 bp) (a)

Disease                      Number of tested    Number of positive
                             individuals         individuals

MS                           15                  6: 2(+++), 2(++), 2(+)
Neurological references      2                   1(+++)
Healthy references (CTS)     22                  1(+/-)


a) The strips, which contain 1.5 micrograms of recombinant
antigen pET-gag clone 2, were reacted with sera diluted to 1/100.
The Western blot interpretation is based on the presence or the
absence of a specific pET-gag clone 2 (1077 bp) band on the
strips. Positive and negative controls are included in each
experiment.

These results show that, under the technical conditions used,
about 40% of the tested sera from patients afflicted with multiple
sclerosis react with the recombinant proteins pET28C-clone 2
(1077 bp) and pET21C-clone 2 (1077 bp). A reactivity was observed
for one neurological reference, and it is interesting to note that
the RNA extracted from this serum, after the reverse transcriptase
stage, is also amplified by PCR in the pol region. This suggests
that persons who have not been declared as having MS can also
harbor and express this virus. On the other hand, an apparently
healthy reference sample (CTS donor) has some anti-gag (clone 2,
1077 bp) antibodies. This is compatible with an immunity acquired
against MSRV-1 independently of any declared associated autoimmune
disease.
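
The proportions behind these statements follow directly from the
counts in the table. The short Python sketch below recomputes them;
the dictionary layout and the function name are assumptions made
for this illustration, and with groups this small the percentages
are only indicative.

```python
# (positive individuals, tested individuals), as given in the table.
RESULTS = {
    "MS": (6, 15),
    "Neurological references": (1, 2),
    "Healthy references (CTS)": (1, 22),
}


def percent_positive(positive: int, tested: int) -> float:
    """Percentage of tested individuals whose serum reacted."""
    if tested <= 0:
        raise ValueError("at least one tested individual is required")
    return 100.0 * positive / tested


if __name__ == "__main__":
    for group, (positive, tested) in RESULTS.items():
        rate = percent_positive(positive, tested)
        print(f"{group}: {positive}/{tested} = {rate:.1f}%")
```

For the MS group this gives 6/15 = 40%, the value quoted in the
text.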

EXAMPLE 7: PRODUCTION OF A CLONE LB13 THAT CONTAINS AT 3' ONE PART
HOMOLOGOUS TO THE CLONE 2 CORRESPONDING TO THE GENE GAG
AND AT 5' ONE PART HOMOLOGOUS TO THE CLONE 5M6
CORRESPONDING TO THE REGION LTR U5.

An RT-PCR ("reverse transcriptase polymerase chain reaction")
was carried out on the total RNA extracted from virions,
concentrated by ultracentrifugation, that came from culture
supernatants of B lymphocytes of patients afflicted with multiple
sclerosis. The cDNA was synthesized with a specific primer SEQ No
XXX and the reverse transcriptase "Expand® RT" of Boehringer
Mannheim according to the conditions recommended by the company.

Primer used for the synthesis of the cDNA, identified by SEQ
ID No: 138:

5' CTT GGA GGG TGC ATA ACC AGG GAA T 3'

An amplification by PCR was carried out with the Taq
polymerase (Perkin Elmer, France) under the following conditions:
94 °C for 1 minute, 55 °C for 1 minute, 72 °C for 2 minutes during
35 cycles and 72 °C for 7 minutes, and with a final reaction volume
of 100 µl.

Primers used for the amplification by PCR:

■ Primer 5', identified by SEQ ID No: 139:
5' TGT CCG CTG TGC TCC TGA TC 3'

■ Primer 3', identified by SEQ ID No: 138:
5' CTT GGA GGG TGC ATA ACC AGG CAA T 3'

A second, so-called "semi-nested" amplification by PCR was
carried out with a 3' primer located inside the region already
amplified. This second amplification was carried out under the
same experimental conditions as those used during the first
amplification, using 10 µl of the amplification product derived
from the first PCR.

Primers used for the amplification by "semi-nested" PCR:

■ Primer 5', identified by SEQ ID No: 139:
5' TGT CCG CTG TGC TCC TGA TC 3'

■ Primer 3', identified by SEQ ID No: 140:
5' CTA TGT CCT TTT GGA CTG TTT GGG T 3'

The primers SEQ ID No: 138 and SEQ ID No: 140 are specific for
the gag region, clone 2 nucleotide positions No. 373-397 and No.
433-456. The primers used in the 5' region were designed from
sequences of clones obtained in earlier experiments.

An amplification product of 764 bp was obtained and cloned in
the following way:

The amplified DNA was inserted into a plasmid by means of the
TA Cloning® kit. Two µl of DNA solution were mixed with 5 µl of
sterile distilled water, 1 µl of a 10-fold concentrated ligation
buffer ("10X Ligation Buffer"), 2 µl of "pCR® VECTOR" (25 ng/ml)
and 1 µl of "T4 DNA Ligase." This mixture was incubated overnight
at 14 °C. The following stages were carried out in conformity with
the instructions of the TA Cloning® kit (Invitrogen). After
transformation of E. coli INVαF' bacteria with the ligation
product, the mixture was plated. At the end of the procedure, the
white colonies of recombinant bacteria were subcultured in order to
be grown and to allow the extraction of the incorporated plasmids
according to the procedure called "mini-preparation of DNA" (17).
The plasmid preparation of each recombinant colony was digested
with the restriction enzyme Eco RI and analyzed on agarose gel. The
plasmids that had an insert, detected under ultraviolet light after
staining of the gel with ethidium bromide, were selected for the
sequencing of the insert, following hybridization with a primer
complementary to the T7 promoter present on the cloning plasmid of
the TA Cloning® kit. The reaction prior to the sequencing was then
carried out according to the method recommended for the use of the
sequencing kit "PRISM® Ready Reaction AmpliTaq® FS, DyeDeoxy®
Terminator" (Applied Biosystems, ref. 402119), and the automatic
sequencing was carried out on the Applied Biosystems 373 A and 377
devices, according to the instructions of the manufacturer.

The resulting clone LB13 contains an N-terminal region of the
gene gag of MSRV-1 homologous to the clone 2 and an LTR
corresponding to one part of the U5 region. Between the U5 region
and gag, a binding site for the transfer RNA, the PBS ("primer
binding site"), was identified.

The nucleotide sequence of the fragment of 764 bp of the clone
LB13 in the plasmid "pCR® vector" is represented by the identifier
SEQ ID No: 141.

The binding site for the transfer RNA, which has a sequence
of the tryptophan PBS type, was identified at nucleotide
positions No. 342-359 of the clone LB13.
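
Nucleotide positions such as 342-359 are 1-based and inclusive, so
the primer binding site spans 359 - 342 + 1 = 18 nucleotides, the
usual length of a retroviral PBS. The Python sketch below shows the
conversion to 0-based slicing; the function names and the toy
sequence are hypothetical, since the sequence of LB13 itself is not
reproduced here.

```python
def span_length(start: int, end: int) -> int:
    """Length of a 1-based, inclusive interval of nucleotide positions."""
    if start < 1 or end < start:
        raise ValueError("expected 1 <= start <= end")
    return end - start + 1


def extract(seq: str, start: int, end: int) -> str:
    """Return the subsequence at 1-based, inclusive positions start..end."""
    if end > len(seq):
        raise ValueError("interval extends past the end of the sequence")
    span_length(start, end)  # validates the interval
    return seq[start - 1:end]


if __name__ == "__main__":
    print(span_length(342, 359))       # PBS in clone LB13
    print(span_length(373, 397))       # primer position in clone 2
    print(extract("ACGTACGT", 2, 4))   # toy sequence, not from the clones
```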

Another clone, named LA15, was obtained from the total RNA
extracted from virions concentrated by ultracentrifugation from a
culture supernatant of synoviocytes derived from a patient
afflicted with rheumatoid arthritis. The strategy of amplification
and cloning of the clone LA15 is exactly the same as that used for
the clone LB13.

The nucleotide sequence of the clone LA15, which is represented
by the identifier SEQ ID No: 142, is very similar to that of the
clone LB13. This suggests that the retrovirus MSRV-1 detected in
multiple sclerosis has some sequences similar to those encountered
in rheumatoid arthritis.

[LIST OF SEQUENCES]

Pp. 35-47 are not translated as indicated on the original.

CLAIMS



1. Nuclear material, in the isolated or purified state, which
includes a nucleotide sequence chosen in the group that consists of
(i) the sequences SEQ ID No: 112, SEQ ID No: 114, SEQ ID No: 117,
SEQ ID No: 120, SEQ ID No: 124, SEQ ID No: 130, SEQ ID No: 141, and
SEQ ID No: 142; (ii) the complementary sequences of the sequences
(i); and (iii) the sequences equivalent to the sequences (i) or
(ii), in particular the sequences that have, for the entire series
of 100 contiguous monomers, at least 50%, and preferably at least
70% homology with the sequences (i) or (ii) respectively.

2. Nuclear material, in the isolated or purified state, which 
codes for a polypeptide that has, for the entire contiguous series 
of at least 30 amino acids, at least 50%, and preferably at least 
70% homology, with a peptide sequence chosen in the group that 
consists of SEQ ID No: 113, SEQ ID No: 115, SEQ ID No: 118, SEQ 
ID No: 121, SEQ ID No: 135, and SEQ ID No: 137. 

3. Retroviral nuclear material in which the gene pol includes 
a nucleotide sequence identical to or equivalent to a sequence 
chosen in the group that consists of SEQ ID No: 112, SEQ ID No: 
124, and their complementary sequences. 

4. Retroviral nuclear material in which the end 5' of the gene 
pol begins at the nucleotide 1419 of the SEQ ID No: 130. 

5. Retroviral nuclear material in which the gene pol codes for
a polypeptide that has, for the entire contiguous series of at
least 30 amino acids, at least 50%, and preferably at least 70%
homology with the peptide sequence SEQ ID No: 113.

6. Retroviral nuclear material in which the end 3' of the gene
gag ends at the nucleotide 1418 of the SEQ ID No: 130.

7. Retroviral nuclear material in which the gene env includes
a nucleotide sequence identical to or equivalent to a sequence
chosen in the group that consists of SEQ ID No: 117, and its
complementary sequences.

8. Retroviral nuclear material in which the gene env includes 
a nucleotide sequence that begins at the nucleotide 1 of SEQ ID NO: 
117 and ends at the nucleotide 233 of SEQ ID No: 114. 

9. Retroviral nuclear material in which the gene env codes for
a polypeptide that has, for the entire contiguous series of at
least 30 amino acids, at least 50% and preferably at least 70%
homology with the sequence SEQ ID No: 118.

10. Retroviral nuclear material in which the region U3R of LTR 
3' includes a nucleotide sequence that terminates at nucleotide 617 
of SEQ ID No: 114. 

11. Retroviral nuclear material in which the region RU5 of LTR 
5' includes a nucleotide sequence that begins at nucleotide 755 of 
SEQ ID NO: 120 and ends at nucleotide 337 of SEQ ID No: 141 or SEQ 
ID No: 142. 

12. Retroviral nuclear material that includes a sequence that 
begins at nucleotide 755 of SEQ ID No: 120 and that terminates at 
nucleotide 617 of SEQ ID No: 114. 

13. Retroviral nuclear material according to any of the 
preceding claims characterized in that it is associated with at 
least one auto-immune disease such as multiple sclerosis or 
rheumatoid arthritis. 

14. Nucleotide fragment that includes a nucleotide sequence 
chosen in the group that consists of (i) the sequences SEQ ID No: 
112, SEQ ID No: 114, SEQ ID No: 117, SEQ ID No: 120, SEQ ID No: 
124, SEQ ID No: 130, SEQ ID No: 141, and SEQ ID No: 142; (ii) the 
complementary sequences of the sequences (i); and (iii) the
sequences equivalent to the sequences (i) or (ii), in particular
the sequences that have for the entire series of 100 contiguous 
monomers, at least 50%, and preferably at least 70% homology with 
the sequences (i) or (ii) respectively. 

15. Nucleotide fragment according to Claim 14 consisting of a 
nucleotide sequence chosen in the group that consists of (i) the 
sequences SEQ ID No: 112, SEQ ID No: 114, SEQ ID No: 117, SEQ ID 
No: 120, SEQ ID No: 124, SEQ ID No: 130, SEQ ID No: 141, and SEQ ID 
No: 142; (ii) the complementary sequences of the sequences (i); and
(iii) the sequences equivalent to the sequences (i) or (ii), in
particular the sequences that have for the entire series of 100 
contiguous monomers, at least 50%, and preferably at least 70% 
homology with the sequences (i) or (ii) respectively. 

16. Nucleotide fragment that includes a nucleotide sequence 
that codes for a polypeptide that has, for the entire contiguous 
series of at least 30 amino acids, at least 50%, and preferably at 
least 70% homology, with a peptide sequence chosen in the group 
that consists of SEQ ID No: 113, SEQ ID No: 115, SEQ ID No: 118, 
SEQ ID No: 121, SEQ ID No: 135, and SEQ ID No: 137. 

17. Nucleotide fragment according to Claim 16 consisting of a
nucleotide sequence that codes for a polypeptide that has, for the 
entire contiguous series of at least 30 amino acids, at least 50%, 
and preferably at least 70% homology, with a peptide sequence 
chosen in the group that consists of SEQ ID No: 113, SEQ ID No: 
115, SEQ ID No: 118, SEQ ID No: 121, SEQ ID No: 135, and SEQ ID No: 
137. 

18. Nuclear probe for the detection of a retrovirus associated 
with multiple sclerosis and/or rheumatoid arthritis, characterized 
in that it is capable of being hybridized specifically on any 
fragment according to any of Claims 14 to 17 that belongs to
the genome of the said retrovirus.

19. Probe according to Claim 18 characterized in that it has 
from 10 to 100 nucleotides, preferably from 10 to 30 nucleotides. 

20. Primer for the amplification by polymerization of RNA or 
DNA of a retrovirus associated with multiple sclerosis and/or 
rheumatoid arthritis characterized in that it includes a nucleotide 
sequence identical or equivalent to at least one part of the 
nucleotide sequence of a fragment according to any one of Claims 8 
to 11, especially a nucleotide sequence that has for any series of 
10 contiguous monomers, at least 50%, and preferably at least 70% 
homology with at least the said part of the said fragment. 

21. Primer according to Claim 20 characterized in that its 
nucleotide sequence is chosen among SEQ ID No: 116, SEQ ID No: 119, 
SEQ ID No: 122, SEQ ID No: 123, SEQ ID No: 126, SEQ ID No: 127, SEQ 
ID No: 128, SEQ ID No: 129, SEQ ID No: 132, and SEQ ID No: 133. 

22. RNA or DNA, and especially the replication and/or 
expression vector, which includes a genomic fragment of the nuclear 
material according to any of Claims 1 to 7 or a fragment according 
to any of Claims 14 to 17. 

23. Peptide coded by any open reading frame that belongs to a
nucleotide fragment according to any of Claims 14 to 17, especially
a polypeptide, for example an oligopeptide, that forms or includes
an antigenic determinant recognized by the sera of patients infected
by the virus MSRV-1, and/or in which the virus MSRV-1 has been
reactivated.

24. Peptide according to Claim 23 that includes a sequence
partially or completely identical to, or equivalent to, a sequence
chosen among SEQ ID No: 113, SEQ ID No: 115, SEQ ID No: 118, SEQ
ID No: 121, SEQ ID No: 135, and SEQ ID No: 137.

25. Diagnostic, prophylactic or therapeutic compound,
especially for inhibiting the expression of at least one retrovirus 
associated with multiple sclerosis and/or rheumatoid arthritis, 
which includes a nucleotide fragment according to any of Claims 14 
to 17. 

26. Process for detecting a retrovirus associated with
multiple sclerosis and/or rheumatoid arthritis in a biological
sample, characterized in that an RNA and/or a DNA presumed to
belong to or come from the said retrovirus, or their complementary
RNA and/or DNA, is brought into contact with a compound that
includes a nucleotide fragment according to any of Claims 14 to 17.
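
By way of illustration only, the homology criterion recited in the claims (at least 50%, preferably at least 70%, over a series of contiguous monomers) and the complementary sequences of Claim 14 (ii) can be sketched as follows. The sketch does not form part of the claimed subject matter. It assumes that "homology" is read as ungapped position-by-position identity between two sequences of equal length that are already aligned, and that the criterion must hold for every window of contiguous monomers; other readings are possible. The function names percent_identity, min_window_identity, is_equivalent and reverse_complement are hypothetical and are introduced here only for the example.

```python
# Illustrative sketch only: "homology" is ASSUMED to mean ungapped identity
# between two pre-aligned sequences of equal length. All names are hypothetical.

def percent_identity(a: str, b: str) -> float:
    """Percentage of positions at which two equal-length sequences agree."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    if not a:
        raise ValueError("sequences must be non-empty")
    matches = sum(x == y for x, y in zip(a.upper(), b.upper()))
    return 100.0 * matches / len(a)


def min_window_identity(a: str, b: str, window: int = 100) -> float:
    """Lowest identity found over all windows of `window` contiguous monomers."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    if len(a) < window:
        raise ValueError("sequences are shorter than the window")
    return min(
        percent_identity(a[i:i + window], b[i:i + window])
        for i in range(len(a) - window + 1)
    )


def is_equivalent(a: str, b: str, window: int = 100, threshold: float = 50.0) -> bool:
    """True if every window reaches the threshold (50%, or 70% if preferred)."""
    return min_window_identity(a, b, window) >= threshold


# Complement table for DNA and RNA letters; the result is written as DNA.
_COMPLEMENT = str.maketrans("ACGTUacgtu", "TGCAAtgcaa")


def reverse_complement(seq: str) -> str:
    """Complementary strand, read 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]
```

For peptide sequences the same functions can be called with window=30, and for primers with window=10, under the same assumptions.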

[Figure 1] 
Key: 

ADN PROVIRAL = proviral (retroviral) DNA;

ARN GENOMIQUE (VIRION) = genomic RNA (virion).

[Figure 2] to [Figure 11]
Lists of gene sequences;
(suite) = (continuation).
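
In the sequence lists, each numbered line of nucleotides is followed by three lines of one-letter amino acid codes, which correspond to the translation of that line in the three forward reading frames, a period marking a stop codon. By way of illustration only, this layout can be sketched as follows, assuming the standard genetic code. The names translate and three_frames are hypothetical, and the use of "X" for a codon that cannot be resolved is an assumption of the sketch.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
# Stop codons are written "." to match the layout of the sequence lists.
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY..CC.WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(_BASES, repeat=3), _AMINO)
}


def translate(seq: str, frame: int = 0) -> str:
    """Translate a nucleotide sequence starting at offset `frame` (0, 1 or 2).

    RNA input is accepted (U is read as T). Codons containing any other
    letter are rendered "X" (an assumption of this sketch). A trailing
    incomplete codon is ignored.
    """
    if frame not in (0, 1, 2):
        raise ValueError("frame must be 0, 1 or 2")
    dna = seq.upper().replace("U", "T")
    codons = (dna[i:i + 3] for i in range(frame, len(dna) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def three_frames(seq: str) -> list[str]:
    """Translations of the three forward reading frames, in order."""
    return [translate(seq, frame) for frame in range(3)]


if __name__ == "__main__":
    # Short made-up sequence, not taken from the sequence lists.
    for line in three_frames("ATGGCCTAA"):
        print(line)
```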

GJYUmCAA QGAOOGCIAG TATOOaCTAA iraDCTLUIOG GAAAOGAAQC 50
AYRR TPS MG. SPLG NQA 
LIE GPLV WGN PLW ETKP 
L.K DP. YGVI PSG KPS 

CCXZ^^GTACrc AGCAGGAAAA ATAGAAIAOG AAACTTCACA AOGACATACT 100 
PVL SRKN RIG NLT RTYF 
QYS AGK lE.E TSQ GHT 
PSTQ QEK .NR KPHK DIL 

TIXXTCXXDCJr ailAGAaXXTT AQCQOGAG GAAf^^ 150 

PPL QMA SH.G RKN TFT 
FLPS RWL ATE EGKI LSP 
SSP PDG. PLR KEK YFHL 

T3CAGCTAAC CAACAGAAAT TACTIAAAAC CCTICAC3CAA AOCTTOCACT 200 
CS.P TEI T.N PSPN LPL 
AAN QQKL LKT LHQ TFHL 
QLT NRN YLKP FTK PST 

TAG3CATIGA TAGCADQCAT CAGATOGOCA AATIATEATr TACIGGAOCA 250 
RH. .HPS DGQ III YWTR 

GID STH QMAK LLF TGP 
.ALI API RWP NYYL LDQ 

GGCLTlTiCA AAACIATCAA GAAGATAGIC A03Q3CIUIG AAGIUIQQCA 300 

PFQNYQEDSQGL. SVP 
GLFK TIK KIV RGCE VCQ 
AFS KLSR R.S GAV KCAK 

AAGAAAIAAT 310 
K K . 
R N N 
E I 

CCCIGTATCT TTAACCIDCT TGflTAAGnT GTCTCYICCA GAATCAAAAC 50
PCIF NLL VKF VSSR IKT 
PVS LTSL LSL SLP ESKL 
LYL .PPC.VCLFQNQN 

TCTAAAACEA. CAAAli'iUi'lC TICAAATOGA QCACCAGATC GAGICCATCA 100 
VKL QIVL QME HQM ESMT 
.NY KLF FKWS TRW SP. 
CKTT NCS SNG APDG VHD 

CIAAGATOCA CCGflGGAOX: CIGGAai3GC CIQCTAGCrx: AIQCTOCXSAT 150 

KIH RGP LDRP ASP CSD 
LRST VDP WTG LLAH APM 
.DP PWTP GPA C.P MLRC 

GTEAATGACA TIGAAGGCAC CCCTOOOGAG GAAATCICAA CIQCACAACC 200 
VNDI EGT PPE EIST AQP 
LMT LKAP LPR KSQ LHNP 
. .H .RH PSRG NLN CTT 

CCTACIMGC (DOCAATICAG OGGGAAGCAG TEAGAGCGGT CAICAGCTm 250 
LLC PNSA GSS .SG HQPT 
YYA PIQ REAV RAV ISQ 
PTMP QFS GKQ LERS SAN 

CCIDOCCAAC AGCACnOQG TTTKriUTT GAGAGQQGGG ACTGAGPGAC 300 

SPT ALG FSC. EGG LRD 
PPQQ HLG FPV ERGD .ET 
LPN STWV FLL RGG TERQ 

PGCPCIPOOT aGATTTGCEA GGCXIAADGAA GAATCCXTTAA GOCTAGCTOG 350 
RTSW IS. ANE ESLS LAG 
GLA GFPR PTK NP. A.LG 
D.L DFL GQRR IPK PSW 

QW3G?IGACr GCATDCAOCT CEAAACATOG QQCITCCAAC TEAGCTCACA 400
KVT ASTS KHG ACN LAHT 
R-LHPPLNMGLAT LT 
EGDC IHL .TW GLQL SSH 

0003AOGAAT CAfflGAOCTC ACIAAAATQC TAATEAGGCA AAAAIAOSG 450 

RPIRELTKMLIRQK E 
PDQS ESS LKC .LGK NRR 
PTN QRAH .NA N.A KIGG 

GTAAAGAAAT AGCXSATCAT CIATIGCriG AGAGCACAGC QQGAQQGACA 500 
^^K - YCL RAQR EGQ 

•RN SQSS lA. EHS GRDK 
KEI ANH LLPE STA GGT 

AQSOOGGGA TA^AAAOTA GOCATia^ QCX3aCAA0GG CAACXrCCIT 550 
GSG YKPR HSS RQR QPPL 
DRD INP GIRA GNG NPL 
RIGI .TQ AFE PATA TPF 

IQOSIDOOCT OOCITIGIAT G3333CICIG TTITCACICr AnTCACICT 600 

GPL PLY GRSV FTL FHS 
WVPS LCM GAL FSLY FTL 
GSP PFVW ALC FHS ISLY 

ATTAAAICrr GCAACIGAAA AAAAAAAAAA AAAAA 635 
IKSC N.K KKK K 
LNL ATEK KKK K 
• IL QLK KKKK K 

ATOGOOCTOC CriATCATAC TITTCIUITr ACIGTIUIUT TACOCXXTTT 50
MALP YHT FLF TVLL PPF 
WPS LIIL FSL LFS YPLS 
GPP LSY FSLY CSL TPF 

CQClCICACr GCACQCXXTC CAIQCIQCIG TACAACCAGT AQCiarCIT 100 
ALT APPP CCC TTS SSPY 
LSL HPL HAAV QPV APL 
RSHC TPS MLL YNQ. LPL 

ACOVAS^GTT TCTAIGAAGA ACQCaacrrc 150 

QEF L.R TRLP GNI DAP 

TKSF YEE RGF LEIL MPH 

PRV SMKN AAS WKY .CPI 

TCATAmSGA CrmCATCTAA GGGAAACTOC ACXHTCACIG GOCACACXXA 200 
SYRS LSK GNS TFTA HTH 
HIG VYLR ETP PSL PTPI 
I.E FI. GKLH LHC PHP 

TAT3QCXXX3C AACIQCIATA ACIUIGCCAC TCTTIGCAIG CAT3CAAAIA 250 
MPR NCYN SAT LCM HANT 
CPA TAI TLPL FAC MQI 
YAPQLL. LCHSLHACKY 

CTICATEATIG GACAGGGAAA ATGATEAATC CIAUi'iUiOC TOGAGGACPr 300 

HYW TGK MINP SCP GGL 
LIIG QGK .LI LVVL EDL 
SLL DREN D.S .LS WRTW 

GGAGOCACIG TL'lUi'iGGAC TTACITCACC CATAOCAGTA 1GICIGAT3G 350 
GATV CWT YFT HTSM SDG 
EPL SVGL TSP IPV CLMG 
SHC LLD LLHP YQY V.W 

AOCICACCTG TCEAAAAnr AGCAAIACTA TAGACACAAC CAGCTOOCAA 750

LTC VKF SNTI DTT SSQ 
TSPV .NLAIL .TQPAPN 
PHL CKI. QYY RHN QLPM 

TCCAICAGCJr G3GrrAACACC TGOCACADGA AIAGICTOCX: TAOOCTCAGG 800 
CIRW VTP PTR IVCL PSG 
ASG G.HL PHE .SA YPQE 
HQV GNT SHTN SLP TLR 

AATAnrnr gicigiqgia ccicagocta tcatigittg aatqqcicit 850 

IFF VCGT say HCL NGSS 
YFL SVV PQPI IV. MAL 
NIFC LWY LSL SLFE WLF 

CAGAATCTAT GIGCTIOCIC TCATICnAG TOCOOanAT GAOCAICEAC 900 

ESM CFL SFLV PPM TIY 
QNLC ASS HS. CPL. PST 
RIY VLPL ILS APY DHLH 

ACIGAACAAG AnTAIACAA ICATCKXJEA CCEAAGOOCC ACAACAAAAG 950 
TEQD LYN HVV PKPH NKR 
LNK lYTI MSY LSP TTKE 
.TR FIQ SCRT .AP QQK 

AGEAOOCATT CnDCmTG TEATCAGAGC AGGAGIGCEA G3CAGACTAG 1000 
VPI LPFV IRA GVL GRLG 
YPF FLL LSEQ EC. AD. 
STHS SFC YQS RSAR QTR 

GTACTGGCAT T3GCAGEATC ACAADCIUEA CICAGTICEA CTACAAACTA 1050 

TGI GSI TTST QFY YKL 
VLAL AVS QPL LSST TNY 
YWH WQYH NLY SVL LQTI 

TCTCAAGAAA TAAAIQGnXSA CAIQSAACAG GICACT3ACT CCCTGGICPC 1100
SQEI NGD MEQ VTDS LVT 
LKK .MVT WNR SLT PWSP 
SRN KW. HGTG H.L PGH 

CrroCAASa' CAACTEAACT OQCTAGCAGC AGrCAGTOCTr CAAAAIDGAA 1150 
LQD QLNS LAA VVL QNRR 
CKI NLT P.QQ .SF KIE 
LARS T.L PSS SSPS KSK 

GAGcnmsA crrocEAAcx: oocaaaagag ggggaaccig TrrArnrm 1200 

ALD LLT AKRG GTC LFL 
EL.T C.P PKE GEPV YF. 
SFR LANR QKR GNL FIFR 

GGM3AAGAAC GCiUl'lMTA IGTEAATCAA TOCAGAATTO TCACHS^GAA 1250 
GEER CYY VNQ SRIV TEK 
EKN AVIM LIN PEL SLRK 
RRT LLL C.SI QNC H.E 

AGTEAAAGAA ATIOGAGATC GAATACAAIG TAGAGCAGAG GAQCITCAAA 1300 
VKE IRDR IQC RAE ELQN 
LKK FEI EYNV EQR SFK 
S.RN SRS NTM .SRG ASK 

ACAOQGAACG CIGQGGQCK: CICAGCXIAAT GGATOCCCIG QGnCIOOOC 1350 

TER WGL LSQW MPW VLP 
TPNA GAS SAN GCPG FSP 
HRT LGPP QPM DAL GSPL 

TICITAGGAC CICEAQCAGC TXnAATATTG TIACTOCTCT TIGGAOCXTIG 1400 
FLGP LAA LIL LLLF GPC 
S.D L.QL .YC YSS LDPV 
LRT SSS SNIV TPL WTL 

TAIUnTAAC CTCL'i'iUi'iA AGriTIUIUn: TTOCAGAATT GAAGCIGEAA 1450
IFN LLVK FVS SRI EAVK 
SLT SLL SLSL PEL KL. 
YL.P PC. VCL FQN. SCK 

AQCTACAGAT GUiUl'lACAA ATOGAACCXX A 1481 

LQM VLQ MEP 
SYRW SYK WNP 
ATD GLTN GTP 

TCAAAATOGA AGAQCnTAG ACTIQCEAAC CXXTAAAAGA. QQG3S^A0CT 50
SKSK SFR LAN RQKR GNL 
QNR RALD LLT AKR GGTC 
KIE EL. TC.P PKE GEP 

GrnATmr aggggaagaa TOCiGiTAGrr mgtiaatca aiciggaaic lOO 

FIF RGRM LLV C.S IWNH 
LFL GEE CC.Y VNQ SGI 
VYF. GKN AVS MLIN LES 

ATTACIGAGA AAGTEAAAGA AATTIGAGAT OGAAmiAAT GEAGAGCAGA 150 

Y.E S.R NLRS NIM .SR 
ITEK VKE I.D RI.C RAE 
LLR KLKK FEI EYN VEQR 

GGAOCJnCAA AACACIQCAC CCIQQQGOCT OCICAGQCAA 1GGA3GQ0CT 200 
GPSK HOT LGP PQPM DAL 
DLQ NTAP WGL LSQ WMPW 
TFK TLH PGAS SAN GCP 

GGACICTOGC LTiLTJAQGA CCICIAQCAG CIMAATATT TITACIOCTC 250 
DSP LLRT SSS YNI FTPL 
TLP FLG PLAA IIF LLL 
GLSP S.D L.Q L.YF YSS 

TTIGGACOCr GTATUnCAA Ci'lULTlUi'l' AAGITIGICr CTTOCAGAAT 300 

WTL YLQ LPC. VOL FQN 
FGPC IFN FLV KFVS SRI 
LDP VSST SLL SLS LPEL 

ICAAGCIGIA AAGCTEACAAA TAUi'lL'i'iLA AATDGAAOCC CAGAIOCAGT 350 
. SCK ATN SSS NGTP DAV 
EAV KLQI VLQ MEP QMQS 
KL. SYK .FFKWNPRCS 

QCATGACEAA AATCIAOOGT QGftOOOCTOG ADOQGCCIGC TAGACEA.T3C 400
HD. NLPW TPG PAC .TML 
MTK lYR GPLD RPA RLC 
P.LK STV DPW TGLL DYA 

TCIGATGriA ATCACAnm AGTICACDOCT CQOGAGGAAA lUICAACHGC 450 

.C. .H. SHPSRGNLNC 
SDVN DIE VTP PEEI STA 
LML MTLK SPL PRK SQLH 

ACAAQOOCIA CIACACTCCA ATICAGIMG AAGCAGITAG AGCAGTIGIC 500 
TTPT TLQ FSR KQLE QLS 
QPL LHSN SVG SS. SSCQ 
NPY YTP IQ.E AVR AVV 

AGOCAAOCIC COCAACAGEA CnGQGrnT QCIGTIGftGA GGGIQGACIG 550 
ANL PNST WVF LLR GWTE 
PTS PTV LGFS C.E GGL 
SQPP QQY LGF PVER VD. 

AGAGACAGGA CEAGCTOGAT TIDCIAGGCr GACIAAGAAT OXNAAGOCT 600 

RQD .LD FLG. LRI PKP 
RDRT SWI S.A D.ES XSL 
ETG LAGF PRL TKN PXAX 

ANCIG3GAAG GIGADOGCAT CCATCTITAA ACATGGQGCT TOGAACITAG 650 
XWEG DRI HL. TWGL QLS 
XGK VTAS IFK HGA CNLA 
LGR .PH PSLN MGL AT. 

CTCACACCOG ACCAAIO^ GAQCICACTA AAAT3CTAAT CAQQCAAAAA 700 
SHP TNQR AH. NAN QAKT 
HTR PIR ELTK MLI RQK 
LTPD QSE SSL KC.S GKN 

CAQC^saJCAA AQCAATAGX AATCAICIAT TCCCDGAGAG CACAOOGQGA. 750

GGK AIA NHLL PES TAG 
QEVK Q.P IIY CLRA QRE 
RR. SNSQ SSI A.E HSGK 

AGGACAAGGA TTO9GAIAIA AACICAGGCA TTCAAGCCAG CAACAGCAAC 800 
RTRI GI. TQA FKPA TAT 
GQG LGYK LRH SSQ QQQP 
DKD WDI NSGI QAS NSN 

QOOCTTTQQG TCCXTTOOGA TIGEAIGGS^ QCIUIUl'i'i'i' CACICIMTT 850 
PEG SPPI VWE LCF HSIS 
PLG PLP LYGS SVF TLF 
PLWV PSH CMG ALES LYE 

CACIUrATEA. AATCAIQCAA CIGCMJKTT CIQG?KXCriG 'i'i'I'i'i'imOG 900 

LY. IMQ LHSS GPC FLW 
HSIK SCN CTL LVRV FYG 
TLL NHAT ALF WSV FFMA 

CICW«3CIGA GCnTIGnC GOCA-nrACC ACIGCIGTIT Q0CACXD3ICA 950 
LKLS FCS PST TAVC HRH 
SS. AFVR HPP LLF ATVT 
QAE LLF AIHH CCL PPS 

CAGADODGCr GCIGACTIU: ATCOCmGG ATCCAQCAGA GIGIDCACIG 1000 
RPA ADFH PEG SSR VSTV 
DPL LTS IPLD PAE CPL 
QTRC - LP SLW IQQS VHC 

TGCTOC'IGAT CriAGCCAGGT ACX^CATIGCX: ACTCCCEATC AGGCEAAAGG 1050 

LLI QRG THCH SRS G.R 
CS.S SEV PIA TPDQ AKG 
APD PARY PLP LPI RLKA 

CriaXAITC TIDCTOCA3G GCTAPOVGCC TGCGITTGIC CEAATAGAAC 1100
LAIV PAW LSA WVCP NRT 
LPL FLHG .VP GFV LIEL 
CHC SCM AKCL GLS . .N 

TGAACACTQG TCACIQGGriT CCKTCGYTCT CrroCAHTGAC CTAOaGCTTC 1150 
EHW SLGS MVL FHD PRLL 

NTG HWV PWFS SMT HGF 
. TLV TGF HGS LP.P TAS 

TAAaaGAQCT ATAACACICA (D0QCAIO3QC CAAGATIDCA 'i'iLL'i'iUJiA 1200 

lEL .HS PHGP RFH SLV 
. .SY NTH RMA QDSI PWY 
NRA ITLT AWP KIP FLGI 

TCIGIGAGQC CAAGAACnDC AG3ICAGAGA ANGJIGAGGCT TOQCACCATr 1250 
SVRP R. TP GQR X.GL PPF 
L.G QEPQ VRE XEA CHHL 
CEA KNP RSEX VRL ATI 

TOGGAAGfIGG OXACIGOCA 'i'iTiUGIAGC GOOOCACCAC CATCTIQQGA 1300 
GKW PTAI LVA AHH HLGS 
GSG PLP FW.R PTT ILG 
WEVA HCH FGS GPPP SWE 



GCTCIGGGAG CAAQGATOOC OCAGEAACA 

CGS KDP PVT 
AVGA RIP Q. 
LWE QGSP SN 

CCT;CAAOGrr atictgg^ attoos^^cca atgttgqcact OGAOXTAA 50
PRTY SGE LGP M.HS DAK 
LER ILEN WDQ CDT QTLR 
.NV FWR IGTN VTL RR. 

GAAA3AAM3 ATITATAnC TICIOCAGrTA 033CXnX3GCC ACAATATCCT 100 
KET lYIL LQY RLA TISS 
KKR FIF FCST AWP QYP 
ERND LYS SAV PPGH NIL 

CnCAMQGA GfGAAACCIG GCVTOCKX QGAA3TATAA ATTATAACAT 150 

SRE RNL AS.G KYK L-H 
LQGR ETW LPE GSIN YNI 
FKGEKPGFLREV. IITS 

CATCTE?OG CIPGfCCICV TCTGE^GAAA QGMGGCAAA TOG^GKS^ 200 
HLTA RPL L.:< GGQM E.S 
ILQ LDLF CRK EGK WSEV 
SYS .TS SVER RAN GVK 

TGCCATATOT GCAA^CITIC TmCATIAA C^CfCPJ^CTC ACIAATIMCT 250 
AlC ANFL FIK RQL TIM. 
PYV QTF FSLR DNS QLC 
CHMC KLS FH. ETTH NYV 

AAAAPGTOIG GnTftTOOOC T?O0GA?GC CXTOGPGrC CAOCTOOCTA 300 

KVW FMP YRKP SES TSL 
KKCG LCP TGS PQSP PPY 
KSV VYAL QEA LRV HLPT 

axcpcrxjic occtoocxjsa ciaji ' icc i c APciiN^ns^ gsoixxX'tt 350 

PQRP LPD SFL N. .G PPF 
PSV PSPT PSS TNK DPPL 
PAS PPR LLPQ LIR TPL 

rPsPCXXMPC GG?IDCAAAM GPGftmSsCA AMGGGIAAA CAAIGAMCA 400 
NPN GPKG DRQ RGK Q.TK 

TQT VQK EIDK GVN NEP 
.PKR SKR R.T KG.T MNQ 

AfiCfiGTGCCPi A'mnCCCCG ATEftTOCXXr TGPGPOMG 450 

ECQ YSP IMPP PSS ERR 
KSAN IPR LCP LQAV RGG 
RVP IFPD YAP SKQ ,EEE 

;^GAATICGGC CC?OCCPCPG lOOCIGTPCC TnTTCIUrC TCPGftCTIAA 500 
RIRP SQS ACT FFSL RLK 
EFG PARV PVP FSL SDLK 
NSA QPE CLYL FLS Q T . 

AGCAPATTPA AATAGAOTTA OGrTAAAriCT CPGATAaOCC TCACGQCTAT 550
AN, NRPR .IL R.P .RLY 
QIK IDL GKFS DNP DGY 
SKLK .T - VMS QITL TAX 

ATIGATCrnT TPCPJ^XnrV A3CyCAA10C TTTGATCTGA CAT33?G?^ 600 

. CF TRV RTIL .SD MER 
IDVL QGL GQS FDLT WRD 
LMF YKG. DNP LI. HGEI 

TATAATGTTA CTACTAAATC PCN:PCTAPC OXAAATC^G AGA/^CTGCOG 650 
YNVT TKS DTN PK.E KCR 
IML LLNQ TLT PNE RSAA 
-CY Y.I RH.P QMR EVP 

CTGrCAACTGC AGCTOGft^ TTIOGOGATC TntXTTATCT CPGVCPO30C 700 
CNC SPRV WRS LVS QSGQ 
VTA ARE FGDL WYL SQA 
L.LQ PES LAI FGIS.VRP 

AACAA2!AGGA TGt<:AKJ<^ GGAA^^GAACA ACTOOCfOG OXS^CCPOX: 750 

Q.D DNR GKNN SHR PAG 
NNRM TTE ERT TPTG QQA 
TIG -QQRKEQLPQASRQ 

AGTITXCAGT GnyOOOClC ATIGGGACAC A3AATC?GAA CAroG?V3Arr 800 
SSQC RPS LGH RIRT WRL 
VPS VDPH WDT ESE HGDW 
FPV ,TL IGTQ NQN MEI 

QGrilXC?^CAA ACATITtXTA ACTO 850 

VPQ TFAN LRA RRT EEN. 

CHK HLL TCVL EGL RKT 

GATN IC. LAC .KD. GKL 

MGAflGAWC CTATCAATIA CICAATCATC 900 

EEA YEL LNDV HYN TGK 
RKKP MNY SMM STIT QGK 
GRS L, IT Q.C PL. HRER 

OGAAGAAAAT CTIJ^CIGCTT TTC^ 950 
GRKS YCF SGQ TKGG lEE 
EEN LTAF LDR LRE ALRK 
KKI LLL FWTD .GR H.G 

POCATPCCTC (XrraVCPCCT G^^CTCTATTG AASGOCA^CT AAICITAA^ 1000 
AYL PVT. LY. RPT NLKG 
HTS LSP DSIE GQL ILK 
SIPPCHLTLLKAN. S.R 

GATAATTTTA TCPCrTCPClC PCCVOCPC^ ATTPCf^M/^ /OTCAAAA^ 1050

.VY HSV SCRH .KK LQK 
DKFI TQS AAD IRKN FKS 
ISL SLSQ LQT LEK TSKV 

TCTCOCTTAG G0CO3G?OCA GAACTTA^ AOCCTATTTA A:mi3GCA1C 1100 
SALG PEQ NLE TLFN LAS 
LP. ARSR T.K PYL TWH? 
CLR PGA ELRN PI. LGI 

CTCAGrrmr tataata^ atc^ogagga ocAoaaGAAA cdggacaaat ii50 

SVF YNRD QEE QAK RDKR 
QFF TIE IRRS RRN GTN 
LSFL . .R SGG AGET GQT 

OQGATAAAAA AAAAM3GGG OGTrCCACTAC nTAGICAOX; GOOCTCXXX: 1200 

DKK KRG GPLL .SW PSG 
GIKK KGG VHY FSHG PQA 
G.K KKGG STT LVM ALRQ 

AMCAG?iCIT TQGPGGCrCT GCAAAAGGGA AAAGCTOQGC AAAICAAATC 1250 
KQTL EAL QKG KAGQ IKC 
SRL WRLC KRE KLG KSNA 
ADF GGS AKGK SWA NQM 

OCEAA3MC3G CVOOCTICCA GIG0G3G1CTA CAMGPOCT TTEAAAAA^iGA 1300 
LIG LASS AVY KDT LKKI 
. .G WLP VRST RTL .KR 
PNRA GFQ CGL QGHF KKD 

TrRTOCAAar agaaata;«: ooaxxx'X'iG 'KrArooooc ttaoctca;^ i350 

IQV EIS RPLV HAP YVK 
LSK. K.AAPLSMPLTSR 
YPS RNKP PPC PCP LRQG 

QGAATCPCIO GAMQCCX?C T3G0CXMGG GATCAAGATA CTCIO^GTCA 1400 
GITG RPT APG DEDT LSQ 
ESL EGPL PQG MKI L.VR 
NHW KAH CPRG .RY SES 

GAA3CCATEA ACC?GATCAT CCAGCAGC?^ GPCIGMQCT 1450 
KPL TR,S SSR TEG ARCS 
SH. PDDPAAGLRVPGA 
EAIN QMI QQQ D.GC PGR 

PCCGCCPCXX: CAKXTAICA CCCICPOCA GOOOOGOG?EA TCnTOVXA 1500 

RQP MPS PS' OS PGY V.P 
SASP CHH PHR APGM FDH 
APA HAIT LTE PRV CLTI 

ATOGGC^^GCA GCXiATCAICA TCATCATCAC PCJCPOOOOOC TOGrTOCCQCG 50
MGSS HHH HHH SSGL VPR 

OaOC^iGCCAT ATOGCIMCA TCACTOGTOG X:MC:AAATG QGnCGGATCC 100 
GSH MASM TOG QQM GRIL 

TAGAA33TAT TCIGGAGAAT TQG»iXAAT GIGACACTCA GACGCTAAGA 150 
ERI LEN WDQC DTQ TLR 

AAGAA^OM' rmrATTCTT CIGCPGIPCC GOGTOGOZAC AATATCCICT 200 
KKRF IFF CST AWPQ YPL 

ICAAGGG^GA GAAACXTOGC TI0CIGA3GG AAGTIATAAAT TATAACAKA 250 
QGR ETWL PEG SIN YNII 

TCTTPCPCCr AGAOCIUnC TGTAGAA;^; A3QGCAAATC GAGIGAAGIG 300 
LQL DLF CRKE GKW SEV 

CCATATCTGC: AAACTTICIT TTCArEAAi::A GACMCICSC AATIMCTAA 350 
PYVQ TFF SLR DNSQ LCK 

AAAGnUIGGr TTATOOOCTA CAGS^AGCOC TCMAGTOCA arnXCTACC 400 
KCG LCPT GSP QSP PPYP 

CCPOC^VCCC CIOQOOGACT CX:TKrTCAA CTAATAAGGA CCCXrClTIA 450 
SVP SPT PSST NKD PPL 

AOXAAAOSG TOCAAAAGGA GAITOO^ 500 
TQTV QKE IDK GVNN EPK 

GAGTOG2AAT ATICCXXDGAT TAIOOOOOCT OCAAGCAGIG AGAGGA3GA3 550 
SAN IPRL CPL QAV RGGE 

AAITOOGOGC AGCCAGAGTG CXnGTAOCTT TTICICICIC AGACTIAAPG 600 
FGP ARV PVPF SLS DLK 

CAAATEAAAA TPG?OC:i3>GG TAAATKTTCA GATAADOCTG AD3GCTATAT 650 
QIKI DLG KFS DNPD GYI 

TGATGTrriA CAMGGITAG GACAATOCTT TCATCTGACA T3GPGAGATA 700 
DVL QGLG QSF DLT WRDI 

TAATGITOCr ACEAAATCAG ACACEAAOr CAAAIGAGAG AAGTIGOOXT 750 
MLL LNQ TLTP NER SAA 

GTAACIDC^G CTGGG^iG^GnT TOGOGAICTT TGGimCTCA GIC?iGGOCAA 800 
VTAA REF GDL WYLS QAN 

CAATA3GATC ACAAO^G^ AAAGAACAAC TDGCAC^^GQC CAGCAGGO^ 850
NRM TTEE RTT PTG QQAV 

TTOCCAGIGT AGAOOCTCAT TQGGACACAG AATCPGAACA TGG^TIGG 900 
PSV DPH WDTE SEH GDW 

TOOCACAAAC ATTIGCTAAC TIGOC?IGCrA GAAGGACTGA GGAAAACTAG 950 
CHKH LLT CVL EGLR KTR 

GA^^GAMOCT AOXSAAnSiCr CAATCATOIC C30M1AACA CAGGGAA^^GG 1000 
KKP MNYS MMS TIT QGKE 

AAGAAAATCT TACIGCnTT CTOGAC^GAC TAAG9GAGGC ATIGAGGAAG 1050 
ENL TAP LDRL REA LRK 

CATADCTOOC TGnOOTIGA CICEA3TGAA GOOCAACTAA TCTEAAAGGA 1100 
HTSL SPD SIE GQLI LKD 

IMGTTEATC ACICfiGTCAG CT3CSGACAT TT^GAAAAAAG TTCAAAAGIC 1150 
KFI TQSA ADI RKN FKSL 

TOGCTAAGCT TCSCGQOOGCA CTCG^^CCPCC NOCPCCPCCA CCACIGAGAT 1200 
PKL AAA LEHH HHH H.D 

CCX3GCTOCTA ACAAAGCXTG AAMGAMCT G^GITGQCIN GIGGCNA 1247 
PAAN KAR KEA ELAX G 

ATOQCEAGCA TCACTGCJIQG ACAGCAAATC GGTOOGATCC mPAOirAT 50
MASM TGG QQM GRIL ERI 

TCTOG^lGAAT TOGGACXIAAT GTIGAZACTCA GACXXTAAGA AA3AAACGAT 100 
LEN WDQC DTQ TLR KKRF 

TTATATlCrr CTOCAGEAO: GOCTIGGOCAC AATATCOCT TCAaGGGO^ 150 
IFF GST AWPQ YPL QGR 

GAAADCTGQC TTOTIGAGGG A^GTATAAAT TATAACATCA TCTTACMCTr 200 
ETWL PEG SIN YNII LQL 

;0CUICTK: TGrHAGAA^^GG AGQQCAAATG GAGIGAAGIG OCAIATGTQC 250 
DLF CRKE GKW SEV PYVQ 

AAAL'iTlClT TTCATEAMA GACAACICAC AATTATGEAA AA^^GIGIGGT 300 
TFF SLR DNSQ LCK KCG 

TTAIGCXXTA CMGAAGOX TCAG^CTQCA CXTCXCTADC QCPOOJVCCC 350 
LCPT GSP QSP PPYP SVP 

0100003^ CCTTOCTCAA CmATA?03A OOOOOCnTA AOCCAAADQG 400 
SPT PSST NKD PPL TQTV 

TO^^AAAGGA GM?OOAA GGGG^ 450 
QKE IDK GVNN EPK SAN 

AITOOCOSAT TAIGOOOOCT OCAAGC^GIG i^^GAGGAGG^ AATTOOaOQC 500 
IPRL CPL QAV RGGE FGP 

AGoc^GAGiG cciGTAarrr rnuKnurc agacttaa^ caaattaaaa 550 

ARV PVPF SLS DLK QIKI 
TPGACX71MG T2WVriUICA GMCAACXDCTG M3QCIMAT IGATCnTTA 600 

dlg kfs dnpd gyi dvl 

CAMOGTE?^ GACAATCCTT TGATCTGKA TQGAGAGAIA TAAICTTACT 650 
QGLG QS. F DLT WRDI MLL 

PCUW^TCPG PCPCmPCCC CAAAIG^m^ A^OTXCXSCT (OTACTOC::^ 700 
LNQ TLTP NER SAA VTAA 

QQOG^GAGIT TOQCXMCIT TOGTAIUICA GICMCSOCAA CAAEAQGATC 750 
REF GDL WYLS QAN NRM 

AZAAO^GMG AA^=^GAACAAC 800 
TTEE RTT PTG QQAV PSV 

A3AO0CTCAT TOOG^^CAC^ AATCAGAACA TOG^TTGG T3QCACAAAC 850
DPH WDTE SEH GDW CHKH 

ATITOCTAAC TTOCTGTOCTA GAAGGACIGA GGAAAACTOG GAAGAMCCT 900 
LLT CVL EGLR KTR KKP 

ATCAAmcr CAATCAroiC CACTA31AACA. CAGGGAAA03 AAGAAAATCT 950 
MNYS MMS TIT QGKE ENL 

TACTQCnTT CIQGACAGAC TAAQQGAQQC ATIGSGGA^^G CATACCICO: 1000 
TAF LDRL REA LRK HTSL 

TCTTCAOCTGA CICTAITCAA GQOCAACTAA TCITAAAGGA TPJ^aTTTATC 1050 
SPD SIE GQLI LKD KFI 

PCJX^aiCPG CIOZSaGACAT TAGAAAAAAC TICAAAAJIC TOOCTAACCT 1100 
TQSA ADI RKN FKSL PKL 

TCOmDGCA CIOSAGCAOC ADC^^ 1150 
AAA LEHH HHH H.D PAAN 

ACAAAQOOOG AAAGGAMCT GftGITOGCIG GIQ3CA 1186 
KAR KEA ELAG G 

TCIOOQCIGT QCroCIGATC CAGCACAGX GC0CATIX3CC TCTCQCAATT 50
CPLC S.S STG AHC-L SQL 
VRC APDP AQA PIA SPNW 
SAV LLI QHRR PLP LPI 

GGQCI3W«3G CTIGCCATTG TIOCIGCACA GCIAAGnX3CC TCGGTICNFC 100 
G.R LAIV PAQ LSA WVHP 
AKGLPLFLHS .VPGFI 
GLKA CHC SCT AKCL GSS 

CTAATCGAGC TGAACACEAG TCACIGOGIT CCPCGJITCT CTIOCAIGAC 150 

NRA EH. SLGS TVL FHD 
LIEL NTS HWV PRFS SMT 
. SS .TLV TGF HGS LP.P 

cx::AaQGCTic TAAEAGAGCT AIAACACTCA CIGCATOGIC CAAGATTOCA 200 
PWLL lEL .HS LHGP RFH 
HGF ..SYNTHCMVQDSI 
MAS NRA ITLT AWS KIP 

TroCTIGGAA ICOGIGfiGAC CAAGAADQCC AGGTCAGAGA ACACAAGGCT 250 
SLE SVRP RTP GQR TQGL 
PWN P.D QEPQ VRE HKA 
FLGI RET KNP RSEN TRL 

T30CA0CATC TIGGAAGCAG CCCAOCAOGA TTnOGAAGC AQC0CX30CAC 300 

PPC WKQ PTTI LEA ARH 
CHHV GSS PPP FWKQ PAT 
ATM LEAA HHH FGS SPPL 

TALLUi'iUGGA GCIUIGGGfiG CAAGGADCXr AGCICAACAAT TIGGIGACCA 350 
YLGS SGS KDP R.QF GDH 
ILG ALGA RTP GNN LVTT 
SWE LWE QGPQ VTI W.P 

OGAAGQGACC TGAATCXX3CA ACCATCAAGG GATCTOCAAA GCAATIGGAA 400 
EGT .IRN HEG IS K AIGN 
KGP ESA TMKG SPK QLE 
RRDL NPQ P.R DLQS NWK 

MGITCCia: CAAGGCAAAA NTGCCajn^ GA-TCEATICr QGAGAATT3G 450

VPP KAK MPLR CIL ENW 
MFLP RQK CP. DVFW RIG 
CSS QGKN APK MYS GELG 

GACCAATTIG ACCCICAGAC AGEftAGAAAA AAAIGACITA TAITCTIUTG 500 
DQFD PQT VRK K.LI FFC 
TNL TLRQ .EK NDL YSSA 
PI. PSDSKKKMTYILL 

CAGI3\CCQCC CIQQCCAD3A TAICCICTIC AAGGG33AGA AACCIQGOCT 550 
STA LATI SSS RGR NLAS 
VPP WPR YPLQ GGE TWP 
QYRP GHD ILF KGEK PGL 

CCIGAGQGAA GTATAAATEA TAACADCATC TTACAQCTAG ACCIGITnG 600 

.GK YKL .HHL TAR PVL 
PEGS INY NTI LQLD LFC 
LRE V.II TPS YS. TCFV 

TAGftAAAGGA QQCAAAXOGA GIGAAGIQGC AimTACAA ALTi'iUi'i'iT 650 
KRR QME .SA IFTN FLF 
RKG GKWS EVP YLQ TFFS 
EKE ANG VKCH lYK LSF 

CNTTM^APGA CAACIOGCAA TrATGITAAC AGIGIGATIT GIGITCCrAC 700 
IKR QLAI MLT V.F VFLH 
LKD NSQ LC.Q CDL CSY 
H.KT TRN YVN SVIC VPT 

ACGGAAGCCC TCAGAnCTA CTOQOCACOC CCX3QCATCrC OOCTGAATOC 750 
GSPQILLPTPGISPES 

TEAL RFY SPP PASP LNP 
RKP SDST PHP RHL P. IP 

TCTOOQCIGr GCTOCTGATC CAQCACAGQC QCOCATTOOC TCTOOCAATT 50
CPLC S.S STG AHCL SQL 
VRC APDP AQA PIA SPNW 
SAV LLI QHRR PLP LPI 

QGQCIAAAQG CTTGCCKTIG TTOCIGCACA GCTAAGIQOC IGOGTICAIC 100 
G.R LAIV PAQ LSA WVHP 
AKG LPL FLHS .VP GFI 
GLKA CHC SCT AKCL GSS 

Cn^lCGAGC TGAACftCEAG TCACTOGGIT OCACGGTICr CTICCATGAC 150 

NRA EH. SLGS TVL FHD 
LIEL NTS HWV PRFS SMT 
.SS .TLVTGFHGSLP.P 

OIMQGCnC TAAIRGAGCr AEAACACICA CIQCAIGGnX: CAAGATTOCA 200 
PWLL lEL .HS LHGP RFH 
HGF ..SYNTH CMVQDSI 
MAS NRA ITLT AWS KIP 

TIOCnOGAA TCX33IGAGAC CAAGAACCCX: AGSICAGAGA AC2CAAGGCT 250 
SLE SVRP RTP GQR TQGL 
PWN P.D QEPQ VRE HKA 
FLGI RET KNP RSEN TRL 

T3QCACCAIG TIQGAAGCAG CX)CAOCAOCA TTTIGGAAGC QGCODGOCAC 300 

PPG WKQ PTTI LEA ARB 
CHHV GSS PPP FWKR PAT 
ATM LEAA HHH FGS GPPL 

TATCTIQGGA GCICIQQGAG CAAGGADQCC CAGGTAACAA TITGGTGAOC 350 
YLGS SGS KDP QVTI W.P 
ILG ALGA RTP R.Q FGDH 
SWE LWE QGPP GNN LVT 

AOGP^mSAC CIGVODOGC AADCATCAAG QGATCTCCAA PiGCAATIOGA 400 
RRD LNPQ P.R DLQ SNWK 
EGT .IR NHEG ISK AIG 
TKGP ESA TMK GSPK QLE 

AAIiUi'iLUiU (XAAQGCAAA ANIGCCCCJA AGA.TUIA.TTC TOGAGAATIG 450

CSS QGK NAPK MYS GEL 
NVPP KAK MPL RCIL ENW 
MFL PRQK CP. DVF WRIG 

QGACCAAlUr GAOCCICAGA. CAGEAAGAAA AAAAAIGACT mTATICTTC 500 
GPI. PSD SKK KNDL YSS 
DQS DPQT VRK KMT YILL 
TNL TLR Q.EK K.L IFF 

T3CAGEAC0G QCIQG0CAD3 GAIAICCICT TCAAQQGGGA GAAACCIQGC 550 
AVP PGHG YPL QGG ETWP 
QYR LAT DILF KGE KPG 
CSTA WPR ISS SRGR NLA 

CICCIGAG3G AAGISm^AM' TA1?^ACA(XA 600 

PEG SIN YNTI LQL DLF 
LLRE V.I ITP SYS. TCF 
S.G KYKL .HH LTA RPVL 

TCTAGAAAAG GAQGCAAAIG GAGIGAAGIG OCATATmC AAAL'i'i'iUi'i' 650 
CRKG GKW SEV PYLQ TFF 
VEK EANG VKC HIY KLSF 
.KR RQM E.SA IFT NFL 

TICmAAAA GACAACTCGC AATEAIGIAA ACAGIGTGAT I'iG'iUlLC'lA 700 
SLK DNSQ LCK QCD LCPT 
H.K TTR NYVN SVI CVL 
FIKR QLA IM. TV.F VSY 

CAOGA?iGOOC ICAGAIUEAC CTOCTTAOCC 03GCATUIXX: CIGACICCIT 750 

GSP QIY LPTP ASP .LL 
QEAL RST SLP RHLP DSF 
RKP SDLP PYP GIS LTPS 

OCGCAACTAA TAAGGACQCA CTICAGOOCA AACAGICCAA AAGGACATAG 800 
PQLI RTH FSP NSPK GH 
FN. .GPT SAQ TVQ KDI 
PTN KDP LQPK QSK RT. 

GGCAITCAIA GCAOCCAICA GAIGOXAAA TCATEATTm CIGGACCAQG 50
GIDS THQ MAK SLFT GPG 
ALI APIR WPN HYL LDQA 
H. . HPS DGQI IIY WTR 

OL'i'i'i'lCAAA ACEATCAAQC AGMftGGQGC GGIGAAGCAT QOCAAAGAAA 100 
LFK TIKQ IGP VKH AKEI 
FSK LSS R.GP .SM PKK 
PFQN YQA DRA REAC QRN 

TftATCOOCIG CCrEATOGQC ATGnOCTIC AQGAGAACAA AGAACAG3CC 150 

IPC LIA MFLQ ENK EQA 
-SPA LSP CSF RRTK NRP 
NPL PYRH VPS GEQ RTGH 

ATEAOQCAQG GGAAGACIQG CAACIAGArT TTAOCCACAT GGCCAAATCT 200 
ITQG KTG N.I LPTW PNV 
LPR GRLA TRF YPH GQMS 
YPG EDW QLDF THM AKC 

CAQQGATnC AGCAICIACr AGICIOQGCA GATACTTICA CIQGTIQQGr 250 
RDF SIY. SGQ XLS LVGW 
GIS AST SLGR YFH WLG 
QGFQ HLL VWA DTFT GWV 

GGftGICnCr OCnOEAGGA CAGAAAAGAC OCAAGAQGTA ATAAAGQCAC 300 

S L L L V GQKRPKR. .RH 
GVFS L.D RKD PRGN KGT 
ESS PORT EKT QEV IKAL 

TAAIGAAATA ATIDOCAGAT TIGGACTIOC CXXAGGATEA CAGGGnXSACA 350 
. .NN SQI WTS PRIT G.Q 
NEI IPRF GLP PGL QGDN 
MK. FPD LDFP QDY RVT 

ATOQQOGOQC TTICAAGGCr GCAGEAACCX: AGQGAGEATC CCPGSVGJTk 400
WPR FQGC SNP GSI PGVR 
GPA FKA AVTQ GVS QVL 
MAPL SRL Q.P REYP RC. 

GGCATACAAT ATCACTEACA CIGIGOTIGG AGGCTACAAT CXTCCAGAAA 450 

HTI SLT LCLE ATI LQK 
GIQY HLH CAW RPQS SRK 
AYN ITYT VPG GHN PPEK 

AGICAAGAAA ATCAAIGAAA CACICAAAGA TCIAAAAAAG CIAACXX?^ 500 
SQEN E.N TQR SKKA NPR 
VKK MNET LKD LKK LTQE 
SRK .MKHSKI .KS .PK 

AAADOCACAT TQCAIGADCrr GnUIGTIGC CIMJAAOCIT ACIAAGAATC 550 
NPH CMTC SVA YNL TKNP 
THI A.P VLLP ITL LRI 
KPTL HDL FCC L.PY .ES 

CAIAACEATC OCXXAMAAG CAGGACrTAG OX^^^ 600 

• LS PKK QDLA HTR CYM 
HNYP PKS RT. PIRD AIW 
ITI PQKA GLS PYE MLYG 

GATOGOCITT OCIAAOCAAT GACCTIGIGC TIGACIGAGA AAIGQCCAAC 650 
DGLS .PM TLC LTEK WPT 
MAF PNQ. PCA .LR NGQL 
WPF LTN DLVL D.E MAN 

TTAGTIQCAG ACATCAOCIC CITAGOCAAA TATCAACAAG 'i'lU l 'l A AAAC 700 
•LQ TSPP .PN INK FLKH 
SCR HHL LSQI STS S.N 
LVAD ITS LAK YQQV LKT 

ATCACAG3GA AOCTOTOOCX: GAGftGGMGG AAAGGAACIA TTOCACQCIG 750

HRE PVP ERRE RNY STL 
ITGN LSP RGG KGTI PPW 
SQG TCPR EEG KEL FHPG 